2017 ACTL2131 Exercises

UNSW Business School
School of Risk and Actuarial Studies
ACTL2131
Probability and Mathematical Statistics
Exercises
S1 2017
January 5, 2017
Contents
Schedule of Tutorial Exercises 1
1 Probability Theory 2
Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Exercise 1.1 [wk01Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Exercise 1.2 [wk01Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Exercise 1.3 [wk01Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Exercise 1.4 [wk01Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Exercise 1.5 [wk01Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Exercise 1.6 [wk01Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Exercise 1.7 [wk01Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Exercise 1.8 [wk01Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1 Mathematical Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Exercise 1.9 [wk01Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Exercise 1.10 [wk01Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Exercise 1.11 [wk01Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Exercise 1.12 [wk01Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Exercise 1.13 [wk01Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Exercise 1.14 [wk01Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Exercise 1.15 [wk01Q15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Exercise 1.16 [wk01Q16] . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Exercise 1.17 [wk01Q17] . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Univariate Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Exercise 1.18 [wk02Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Exercise 1.19 [wk02Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Exercise 1.20 [wk02Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Exercise 1.21 [wk02Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Exercise 1.22 [wk02Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Exercise 1.23 [wk02Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
ACTL2131 Probability and Mathematical Statistics, S1 2017 Exercises
Exercise 1.24 [wk02Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

Exercise 1.25 [wk02Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Joint and Multivariate Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Exercise 1.26 [wk03Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Exercise 1.27 [wk03Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Exercise 1.28 [wk03Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Exercise 1.29 [wk03Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Exercise 1.30 [wk03Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Exercise 1.31 [wk03Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Exercise 1.32 [wk03Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Exercise 1.33 [wk03Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Sampling and Summarising Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Exercise 1.34 [wk03Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Exercise 1.35 [wk03Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Exercise 1.36 [wk03Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Exercise 1.37 [wk03Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5 Functions of Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Exercise 1.38 [wk04Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Exercise 1.39 [wk04Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Exercise 1.40 [wk04Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Exercise 1.41 [wk04Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Exercise 1.42 [wk04Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Exercise 1.43 [wk05Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Exercise 1.44 [wk05Q15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Exercise 1.45 [wk05Q16] . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Exercise 1.46 [wk04Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Exercise 1.1 [wk01Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Exercise 1.2 [wk01Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Exercise 1.3 [wk01Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Exercise 1.4 [wk01Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Exercise 1.5 [wk01Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Exercise 1.6 [wk01Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Exercise 1.7 [wk01Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Exercise 1.8 [wk01Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Exercise 1.9 [wk01Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Exercise 1.10 [wk01Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Exercise 1.11 [wk01Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Exercise 1.12 [wk01Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Exercise 1.13 [wk01Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Exercise 1.14 [wk01Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Exercise 1.15 [wk01Q15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Exercise 1.16 [wk01Q16] . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Exercise 1.17 [wk01Q17] . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Exercise 1.18 [wk02Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Exercise 1.19 [wk02Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Exercise 1.20 [wk02Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Exercise 1.21 [wk02Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Exercise 1.22 [wk02Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Exercise 1.23 [wk02Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Exercise 1.24 [wk02Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Exercise 1.25 [wk02Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Exercise 1.26 [wk03Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Exercise 1.27 [wk03Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Exercise 1.28 [wk03Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Exercise 1.29 [wk03Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Exercise 1.30 [wk03Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Exercise 1.31 [wk03Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Exercise 1.32 [wk03Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Exercise 1.33 [wk03Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Exercise 1.34 [wk03Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Exercise 1.35 [wk03Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Exercise 1.36 [wk03Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Exercise 1.37 [wk03Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Exercise 1.38 [wk04Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Exercise 1.39 [wk04Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Exercise 1.40 [wk04Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Exercise 1.41 [wk04Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Exercise 1.42 [wk04Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Exercise 1.43 [wk05Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Exercise 1.44 [wk05Q15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Exercise 1.45 [wk05Q16] . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Exercise 1.46 [wk04Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2 Parameter Estimation 58
2.1 Estimation Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Exercise 2.1 [wk05Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Exercise 2.2 [wk05Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Exercise 2.3 [wk05Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Exercise 2.4 [wk05Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Exercise 2.5 [wk05Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Exercise 2.6 [wk05Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.2 Limit Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Exercise 2.7 [wk05Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Exercise 2.8 [wk05Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Exercise 2.9 [wk05Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Exercise 2.10 [wk05Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Exercise 2.11 [wk05Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Exercise 2.12 [wk05Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Exercise 2.13 [wk05Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.3 Evaluating Estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Exercise 2.14 [wk06Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Exercise 2.15 [wk06Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Exercise 2.16 [wk06Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Exercise 2.17 [wk06Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Exercise 2.18 [wk06Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Exercise 2.19 [wk06Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Exercise 2.20 [wk06Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Exercise 2.21 [wk06Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Exercise 2.22 [wk06Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Exercise 2.23 [wk06Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Exercise 2.24 [wk06Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Exercise 2.25 [wk06Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Exercise 2.26 [wk06Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Exercise 2.27 [wk06Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Exercise 2.1 [wk05Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Exercise 2.2 [wk05Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Exercise 2.3 [wk05Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Exercise 2.4 [wk05Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Exercise 2.5 [wk05Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Exercise 2.6 [wk05Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

Exercise 2.7 [wk05Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Exercise 2.8 [wk05Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Exercise 2.9 [wk05Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Exercise 2.10 [wk05Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Exercise 2.11 [wk05Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Exercise 2.12 [wk05Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Exercise 2.13 [wk05Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Exercise 2.14 [wk06Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Exercise 2.15 [wk06Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Exercise 2.16 [wk06Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Exercise 2.17 [wk06Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Exercise 2.18 [wk06Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Exercise 2.19 [wk06Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Exercise 2.20 [wk06Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Exercise 2.21 [wk06Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Exercise 2.22 [wk06Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Exercise 2.23 [wk06Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Exercise 2.24 [wk06Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Exercise 2.25 [wk06Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Exercise 2.26 [wk06Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Exercise 2.27 [wk06Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3 Hypothesis Test 89
3.1 Statistical test procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Exercise 3.1 [wk07Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Exercise 3.2 [wk07Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Exercise 3.3 [wk07Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Exercise 3.4 [wk07Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Exercise 3.5 [wk07Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Exercise 3.6 [wk10Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.2 Properties of the hypothesis testing . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Exercise 3.7 [wk08Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Exercise 3.8 [wk08Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Exercise 3.9 [wk08Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Exercise 3.10 [wk08Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Exercise 3.11 [wk08Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Exercise 3.12 [wk08Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

Exercise 3.13 [wk08Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Exercise 3.14 [wk07Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Exercise 3.15 [wk07Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Exercise 3.16 [wk07Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.3 Non-parametric Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Exercise 3.17 [wk09Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Exercise 3.18 [wk09Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Exercise 3.19 [wk09Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Exercise 3.20 [wk09Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Exercise 3.21 [wk09Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Exercise 3.22 [wk09Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Exercise 3.23 [wk09Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Exercise 3.1 [wk07Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Exercise 3.2 [wk07Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Exercise 3.3 [wk07Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Exercise 3.4 [wk07Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Exercise 3.5 [wk07Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Exercise 3.6 [wk10Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Exercise 3.7 [wk08Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Exercise 3.8 [wk08Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Exercise 3.9 [wk08Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Exercise 3.10 [wk08Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Exercise 3.11 [wk08Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Exercise 3.12 [wk08Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Exercise 3.13 [wk08Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Exercise 3.14 [wk07Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Exercise 3.15 [wk07Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Exercise 3.16 [wk07Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Exercise 3.17 [wk09Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Exercise 3.18 [wk09Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Exercise 3.19 [wk09Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Exercise 3.20 [wk09Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Exercise 3.21 [wk09Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Exercise 3.22 [wk09Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Exercise 3.23 [wk09Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
4 Linear Regression 122

4.1 Simple Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Exercise 4.1 [wk10Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Exercise 4.2 [wk10Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Exercise 4.3 [wk10Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Exercise 4.4 [wk10Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Exercise 4.5 [wk10Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Exercise 4.6 [wk10Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Exercise 4.7 [wk10Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Exercise 4.8 [wk10Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Exercise 4.9 [wk10Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Exercise 4.10 [wk10Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Exercise 4.11 [wk10Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Exercise 4.12 [wk10Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.2 Multiple Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Exercise 4.13 [wk11Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Exercise 4.14 [wk11Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Exercise 4.15 [wk11Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Exercise 4.16 [wk11Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Exercise 4.17 [wk11Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Exercise 4.18 [wk11Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Exercise 4.19 [wk11Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Exercise 4.20 [wk11Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Exercise 4.21 [wk11Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Exercise 4.22 [wk11Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Exercise 4.23 [wk11Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Exercise 4.24 [wk11Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Exercise 4.25 [wk11Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Exercise 4.26 [wk11Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Exercise 4.27 [wk11Q15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Exercise 4.28 [wk11Q16] . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Exercise 4.1 [wk10Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Exercise 4.2 [wk10Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Exercise 4.3 [wk10Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Exercise 4.4 [wk10Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Exercise 4.5 [wk10Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Exercise 4.6 [wk10Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

Exercise 4.7 [wk10Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Exercise 4.8 [wk10Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Exercise 4.9 [wk10Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Exercise 4.10 [wk10Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Exercise 4.11 [wk10Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Exercise 4.12 [wk10Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Exercise 4.13 [wk11Q1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Exercise 4.14 [wk11Q2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Exercise 4.15 [wk11Q3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Exercise 4.16 [wk11Q4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Exercise 4.17 [wk11Q5] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
Exercise 4.18 [wk11Q6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
Exercise 4.19 [wk11Q7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Exercise 4.20 [wk11Q8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Exercise 4.21 [wk11Q9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Exercise 4.22 [wk11Q10] . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Exercise 4.23 [wk11Q11] . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Exercise 4.24 [wk11Q12] . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Exercise 4.25 [wk11Q13] . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Exercise 4.26 [wk11Q14] . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Exercise 4.27 [wk11Q15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Exercise 4.28 [wk11Q16] . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Schedule of Tutorial Exercises
Exercises Before Tutorial During Tutorial After Tutorial Additional

Week 1 1.1, 1.2 1.10, 1.11, 1.12, 1.14, 1.15, 1.16, 1.3-1.9
1.13 1.17
Week 2 1.18, 1.19 1.21, 1.20, 1.22 1.23, 1.24, 1.25
Week 3 1.34, 1.35, 1.36, 1.26, 1.28, 1.37 1.29, 1.30, 1.31, 1.33
1.27 1.32
Week 4 1.38, 1.46 1.40(1,2,3), 1.39, 1.42, 1.43, 1.44,
1.41, 1.40(4,5) 1.45
Week 5 2.1, 2.7 2.2, 2.8, 2.9, 2.3 2.10, 2.11, 2.4, 2.12, 2.13, 2.5,
2.12 2.6
Week 6 2.14, 2.15 2.17, 2.16, 2.19, 2.20, 2.21, 2.22, 2.24, 2.25, 2.26,
2.18 2.23 2.27
Week 7 3.1 3.2, 3.3, 3.6 3.4, 3.5
Week 8 3.7 3.8, 3.13, 3.16, 3.9, 3.10, 3.11, 3.15
3.14 3.12
Week 9 3.17 3.18, 3.21, 3.23 3.19, 3.20, 3.22
Week 10 4.4 4.3, 4.5, 4.6 4.1, 4.2, 4.7, 4.8
Week 11 4.16, 4.13, 4.14, 4.15 4.9, 4.10, 4.11
Week 12 4.17, 4.18 4.19-4.22, 4.27, 4.23-4.26,4.28
Table 1: Schedule of tutorial exercises.
Module 1
Probability Theory
Preliminaries
Exercise 1.1: [wk01Q1, Solution, Schedule] An urn contains one black ball and one gold ball while
a second urn contains one white and one gold ball. One ball is selected at random from each urn.
1. Describe the sample space for this experiment.
2. Describe a $\sigma$-algebra for this experiment.
3. Describe the event that both balls will be of the same colour. What is the probability of this
event?
Exercise 1.2: [wk01Q2, Solution, Schedule] A box contains 100 Christmas balls: 49 are red, 34 are
gold, and 17 are silver. Three balls are to be drawn without replacement. Determine the probability
that:
1. all 3 balls are red;
2. the balls are drawn in the order: red, gold, and silver;
3. the third ball is silver, given that the first 2 are red and gold (not necessarily in that order); and
4. the first 2 are red, given that the third ball is silver.
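Answers to Exercise 1.2 can be checked by exact enumeration. The sketch below is only an illustration, assuming every remaining ball is equally likely at each draw; the helper names (`sequence_probability`, `probability`, `conditional`) are hypothetical and not part of the course material.

```python
from fractions import Fraction
from itertools import product

# Composition of the box in Exercise 1.2.
COUNTS = {"red": 49, "gold": 34, "silver": 17}

def sequence_probability(colours):
    """Exact probability of drawing this ordered colour sequence without replacement."""
    remaining = dict(COUNTS)
    total = sum(remaining.values())
    prob = Fraction(1)
    for colour in colours:
        prob *= Fraction(remaining[colour], total)
        remaining[colour] -= 1
        total -= 1
    return prob

def probability(event):
    """Sum the probabilities of all ordered colour triples satisfying the event."""
    return sum(sequence_probability(seq)
               for seq in product(COUNTS, repeat=3) if event(seq))

def conditional(event, given):
    """Pr(event | given) computed as Pr(event and given) / Pr(given)."""
    return probability(lambda s: event(s) and given(s)) / probability(given)

p_all_red = probability(lambda s: s == ("red", "red", "red"))
p_red_gold_silver = probability(lambda s: s == ("red", "gold", "silver"))
p_third_silver = conditional(lambda s: s[2] == "silver",
                             lambda s: sorted(s[:2]) == ["gold", "red"])
p_first_two_red = conditional(lambda s: s[:2] == ("red", "red"),
                              lambda s: s[2] == "silver")

print(p_all_red, p_red_gold_silver, p_third_silver, p_first_two_red)
```

Working with `Fraction` keeps every probability exact, so the results can be compared directly with hand calculations.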
Exercise 1.3: [wk01Q3, Solution, Schedule] Let A and B be two independent events. Prove that the
following pairs are also independent:
1. $A$ and $B^C$
2. $A^C$ and $B$
3. $A^C$ and $B^C$
Exercise 1.4: [wk01Q4, Solution, Schedule] A pair of events A and B cannot be simultaneously
mutually exclusive and independent. Assume that their probabilities are strictly positive, i.e., Pr (A) >
0 and Pr (B) > 0. Prove the following:
1. If A and B are mutually exclusive, then they cannot be independent.
2. If A and B are independent, then they cannot be mutually exclusive.
Exercise 1.5: [wk01Q5, Solution, Schedule] This exercise shows that the product rule for three events,
$\Pr(E_1 \cap E_2 \cap E_3) = \Pr(E_1)\Pr(E_2)\Pr(E_3)$, does not imply pairwise independence. Consider a
random experiment which consists of tossing two dice. Define the following events:
$E_1$ = {doubles appear}

$E_2$ = {the sum is between (and includes) 7 and 10}

$E_3$ = {the sum is 2 or 7 or 8}
1. Show that $E_1$, $E_2$ and $E_3$ satisfy the product rule above.
2. Show that $E_1$ and $E_2$ are not pairwise independent.
3. Show that $E_2$ and $E_3$ are not pairwise independent.
4. Are $E_1$ and $E_3$ pairwise independent?
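The claims in Exercise 1.5 can be verified by listing all 36 outcomes. The following is a minimal sketch, assuming two fair six-sided dice; the function names `pr`, `e1`, `e2`, `e3` and `both` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of tossing two fair dice (assumption: fair dice).
OUTCOMES = list(product(range(1, 7), repeat=2))

def pr(event):
    """Exact probability of an event given as a predicate on (die1, die2)."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))

def e1(o):
    return o[0] == o[1]            # doubles appear

def e2(o):
    return 7 <= sum(o) <= 10       # sum between 7 and 10 inclusive

def e3(o):
    return sum(o) in (2, 7, 8)     # sum is 2, 7 or 8

def both(a, b):
    """Intersection of two events."""
    return lambda o: a(o) and b(o)

triple = pr(lambda o: e1(o) and e2(o) and e3(o))
print("triple product rule holds:", triple == pr(e1) * pr(e2) * pr(e3))
for name, a, b in [("E1,E2", e1, e2), ("E2,E3", e2, e3), ("E1,E3", e1, e3)]:
    print(name, "independent:", pr(both(a, b)) == pr(a) * pr(b))
```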
Exercise 1.6: [wk01Q6, Solution, Schedule] In an undergraduate statistics class, three students A, B,
and C submitted exactly (word-for-word) the same solution to a homework problem. It is the policy
of the lecturer to give zero marks for those who copy homework problems. Believing that there must
be one of the three who actually did the work, the lecturer will pardon one of the three and chooses at
random the student to pardon.
However, the lecturer will only inform the students at the end of the semester who among them has
been pardoned.
The next day, A tries to get the lecturer to tell him who had been pardoned. The lecturer refuses. A
then asks which of B or C will not be pardoned. The lecturer thinks for a while, then tells A that B is
not to be pardoned.
Lecturer's reasoning: Each student has a 1/3 chance of being pardoned. Clearly, either B or C
must not be pardoned, so I have given A no information about whether A will be pardoned.
A's reasoning: Given that B will not be pardoned, then either A or C will be pardoned. My
chance of being pardoned has risen to 1/2.
1. Evaluate the lecturer's reasoning, i.e., explain whether his reasoning is justified.
2. Evaluate student A's reasoning, i.e., explain whether his reasoning is justified.
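One way to compare the two lines of reasoning in Exercise 1.6 is to write out the joint distribution of who is pardoned and whom the lecturer names. The sketch below makes an assumption the exercise does not state: when A is the pardoned student, the lecturer names B or C with probability 1/2 each. The function names are hypothetical.

```python
from fractions import Fraction

STUDENTS = ("A", "B", "C")

def joint_distribution():
    """Joint probabilities of (pardoned student, student named to A as not pardoned)."""
    joint = {}
    for pardoned in STUDENTS:
        # The lecturer never names A and never names the pardoned student.
        options = [s for s in ("B", "C") if s != pardoned]
        for named in options:
            # Assumption: if both B and C are eligible, each is named with probability 1/2.
            joint[(pardoned, named)] = Fraction(1, 3) * Fraction(1, len(options))
    return joint

def posterior(pardoned, named="B"):
    """Pr(given student is pardoned | lecturer names `named`) via Bayes' rule."""
    joint = joint_distribution()
    evidence = sum(p for (_, n), p in joint.items() if n == named)
    return joint.get((pardoned, named), Fraction(0)) / evidence

print("Pr(A pardoned | B named) =", posterior("A"))
print("Pr(C pardoned | B named) =", posterior("C"))
```

Changing the tie-breaking probability in `joint_distribution` shows how the conclusion depends on the lecturer's naming rule.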
Exercise 1.7: [wk01Q7, Solution, Schedule] Two airlines serving some of the same cities in Australia
have merged. Management has decided to eliminate some of the repetitious daily flights. On the
Perth-Sydney route, one airline originally had five daily flights (each at a different time) and the other
had six daily flights (each at a different time). Determine the number of ways:
1. four flights can be eliminated.
2. the first airline can eliminate two of its scheduled five flights.
3. the second airline can eliminate two of its scheduled six flights.
4. two flights can be eliminated from each airline.
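The counts in Exercise 1.7 are binomial coefficients, which can be checked with the standard library. This sketch assumes that flights are distinguishable, that the order of elimination does not matter, and that part 1 places no restriction on which airline's flights are removed; the variable names are illustrative.

```python
from math import comb  # available in Python 3.8+

FIRST_AIRLINE = 5   # daily flights of the first airline
SECOND_AIRLINE = 6  # daily flights of the second airline

# 1. Any four of the eleven combined flights.
any_four = comb(FIRST_AIRLINE + SECOND_AIRLINE, 4)
# 2. Two of the first airline's five flights.
two_from_first = comb(FIRST_AIRLINE, 2)
# 3. Two of the second airline's six flights.
two_from_second = comb(SECOND_AIRLINE, 2)
# 4. Two from each airline: the choices are made separately, so the counts multiply.
two_from_each = two_from_first * two_from_second

print(any_four, two_from_first, two_from_second, two_from_each)
```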
Exercise 1.8: [wk01Q8, Solution, Schedule] Three boxes are numbered 1, 2 and 3. For k = 1, 2 and
3, box $k$ contains $k$ blue marbles and $5 - k$ red marbles. In a two-step experiment, a box is selected and
2 marbles are drawn from it without replacement. If the probability of selecting box $k$ is proportional
to $k$, what is the probability that the two marbles drawn have different colours?
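Exercise 1.8 is an application of the law of total probability, which can be sketched in a few lines. The code assumes that box k holds k blue and 5 - k red marbles (five marbles per box) and that each pair of marbles in the chosen box is equally likely; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def pr_different_colours(boxes=(1, 2, 3), marbles_per_box=5):
    """Total probability that the two drawn marbles differ in colour."""
    total_weight = sum(boxes)
    answer = Fraction(0)
    for k in boxes:
        pr_box = Fraction(k, total_weight)     # selection probability proportional to k
        blue, red = k, marbles_per_box - k     # assumed composition of box k
        # One blue and one red marble out of all unordered pairs in the box.
        pr_mixed = Fraction(blue * red, comb(marbles_per_box, 2))
        answer += pr_box * pr_mixed
    return answer

print(pr_different_colours())
```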
1.1 Mathematical Methods

Exercise 1.9: [wk01Q9, Solution, Schedule] The probability function of a certain discrete random
variable on the non-negative integers satisfies the following:
$\Pr(0) = \Pr(1)$
$\Pr(k + 1) = \Pr(k)/k$ for $k = 1, 2, 3, \ldots$
Find Pr(0).
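A numerical check of Exercise 1.9 is to build the unnormalised weights from the recursion and rescale them to sum to one. This is a rough sketch: the truncation at 30 terms is an arbitrary choice (the weights decay factorially), and the function name is illustrative.

```python
import math

def pr_zero(terms=30):
    """Approximate Pr(0) by normalising weights built from Pr(k+1) = Pr(k)/k."""
    # Relative weights of Pr(0) and Pr(1), which are equal by assumption.
    weights = [1.0, 1.0]
    for k in range(1, terms):
        weights.append(weights[-1] / k)  # weight of Pr(k+1)
    # Probabilities must sum to one, so Pr(0) is 1 over the total weight.
    return 1.0 / sum(weights)

print(pr_zero())
```

The weights after the first are 1/0!, 1/1!, 1/2!, and so on, so the total weight approaches 1 + e as more terms are added.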
Exercise 1.10: [wk01Q10, Solution, Schedule] Consider X, a continuous random variable with density function
$f_X(x) = c e^{-x}$, $x > 1$, and zero otherwise.
Find
1. all $c$ such that $f_X$ is a valid density function, and

2. $\Pr(X < 3 \mid X > 2)$.
Exercise 1.11: [wk01Q11, Solution, Schedule] The distribution function for a discrete random vari-
able X is given by:
$$F_X(x) = \begin{cases} 0 & \text{if } x < -1 \\ 1/3 & \text{if } -1 \le x < 2/3 \\ 1 & \text{if } x \ge 2/3. \end{cases}$$
1. Specify the probability mass function $p_X(x)$.

2. Sketch the graphs of $p_X(x)$ and $F_X(x)$.
Exercise 1.12: [wk01Q12, Solution, Schedule] Let X be a random variable with density:
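$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right], \quad \text{for } -\infty < x < \infty.$$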
Here, X is called a normally distributed random variable.
1. Find an expression for the moment generating function, $M_X(t)$, of X.

2. Now define $S(t) = \log[M_X(t)]$. Show that, in general,

$$\left.\frac{d}{dt}S(t)\right|_{t=0} = E[X] \quad\text{and}\quad \left.\frac{d^2}{dt^2}S(t)\right|_{t=0} = \text{Var}(X).$$
3. Use the above result to prove that, with the normal density, we have
$E(X) = \mu$ and $\text{Var}(X) = \sigma^2$.
4. What is the function $S(t)$ called?
Exercise 1.13: [wk01Q13, Solution, Schedule] Let X be a random variable with parameters $\alpha$, $\beta$, $\gamma$,
and $\delta \in \mathbb{R}$, and have the following moment generating function: $M_X(t) = 1 + \alpha t + \beta t^2 + \gamma t^3 + \delta t^4$.
1. How many distribution functions correspond to this m.g.f. for given values of the parameters?
2. Determine the first five non-central moments of X.
3. Determine the first five central moments of X.
4. Determine the mean, variance, skewness, and kurtosis of X.
5. Let X represent the claim size, i.e., a higher value is bad for the insurer. Insurers A and $B_i$
ask for a quote for reinsuring a tail risk (for example: the reinsurer makes a payment to the insurer
if the loss is larger than $1 million). Based on the mean, variance, skewness and kurtosis, which
of the two would receive a higher quote for reinsuring the risk, and why, if:
i) A's parameters are: $\alpha = 1$, $\beta = 2$, $\gamma = 1$, and $\delta = 1$ and $B_1$'s parameters are $\alpha = 1$,
$\beta = 1$, $\gamma = 0.5384$, and $\delta = 0.2606$;
ii) A's parameters are: $\alpha = 1$, $\beta = 2$, $\gamma = 1$, and $\delta = 1$ and $B_2$'s parameters are $\alpha = 1$, $\beta = 2$,
$\gamma = 2$, and $\delta = 2$;
iii) A's parameters are: $\alpha = 1$, $\beta = 2$, $\gamma = 1$, and $\delta = 1$ and $B_3$'s parameters are $\alpha = 1$, $\beta = 2$,
$\gamma = 1$, and $\delta = 1.625$.
Exercise 1.14: [wk01Q14, Solution, Schedule] The probability density function for a continuous
random variable X is given by:
$$f_X(x) = \begin{cases} 2/x^3 & \text{for } x \ge 1 \\ 0 & \text{otherwise.} \end{cases}$$
1. Determine a formula for the cumulative distribution function $F_X(x)$.
2. Determine the probability that X 4.
3. Sketch the graphs of $f_X(x)$ and $F_X(x)$.
Exercise 1.15: [wk01Q15, Solution, Schedule] Let X be a random variable with probability density
function:
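$$f_X(x) = \begin{cases} \frac{1}{2}\lambda e^{-\lambda x}, & \text{if } x \ge 0; \\ \frac{1}{2}\lambda e^{\lambda x}, & \text{if } x < 0. \end{cases}$$
1. Verify that $f_X(\cdot)$ is a pdf.
2. Find an expression for the cdf $F_X(x)$.
3. Find the moment generating function and the probability generating function of X.
4. Suppose $\lambda = 1$. Evaluate $\Pr(|X| < 3/4)$.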
Exercise 1.16: [wk01Q16, Solution, Schedule] Actuaries often model the age-at-death as a non-
negative random variable X and define the force of mortality as follows:
$$\mu(x) = \lim_{h \to 0} \frac{F_X(x + h) - F_X(x)}{h\,(1 - F_X(x))},$$
where $F_X(\cdot)$ denotes the cdf of X.
1. Using this definition, prove that:

$$F_X(x) = 1 - \exp\left(-\int_0^x \mu(z)\,dz\right).$$
2. Show that for a non-negative random variable:

$$E[X] = \int_0^\infty [1 - F_X(z)]\,dz.$$
Use this result to show that:

$$E[X] = E\left[\frac{1}{\mu(X)}\right].$$
Exercise 1.17: [wk01Q17, Solution, Schedule] A random variable X has a probability density function
of the form:

$f_X(x) = a x (1 - b x^2)$, for $0 \le x \le 1$, and zero otherwise,
where $a$ and $b$ are positive constants.
1. Determine the value of $a$ in terms of $b$ and show that $b \le 1$.
2. For the case $b = 1$, determine the mean and variance of X.
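Because the density in Exercise 1.17 is a polynomial, its moments can be checked exactly by integrating term by term. The sketch below assumes the density reads $a x (1 - b x^2)$ on $[0, 1]$, so that $\int_0^1 x^n f_X(x)\,dx = a\,(1/(n+2) - b/(n+4))$; the function name `moment` is illustrative.

```python
from fractions import Fraction

def moment(n, b):
    """n-th raw moment of f(x) = a*x*(1 - b*x**2) on [0, 1], integrated term by term."""
    b = Fraction(b)
    # Normalising constant: the n = 0 integral must equal one.
    a = 1 / (Fraction(1, 2) - b / 4)
    return a * (Fraction(1, n + 2) - b / (n + 4))

mean = moment(1, 1)
variance = moment(2, 1) - mean ** 2
print(mean, variance)
```

Calling `moment(0, b)` for any admissible `b` returns 1, which confirms the normalisation.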
1.2 Univariate Distributions

Exercise 1.18: [wk02Q1, Solution, Schedule] For each of the following situations, specify the type
of distribution that best models the random variable X and give the parameters of the distribution
chosen (where possible):
1. This year, there are 100 students enrolled in an introductory actuarial studies course. For the
mid-session test for this course, the papers are marked by a team of tutors; however, a sample of
these papers is examined by the course professor for marking consistency. Experience suggests
that 1% of all papers will be improperly marked. The professor selects 10 papers at random
from the 100 papers and examines them for marking inconsistencies. X is the number of papers
in the sample that are improperly marked.
2. A standard drug has been known to be effective in 90% of the cases in which it is used. To
re-evaluate the effectiveness of this same drug, a clinical trial will be performed for which 20
people have volunteered. X is the number of cases where the drug has been found effective.
3. An immunologist is studying blood disorders exhibited by people with rare blood types. It
is estimated that 10% of the population has the type of blood being investigated. Volunteers
whose blood type is unknown are tested until 100 people with the desired blood type are found.
X is the number of people tested who do not have the desired rare blood type.
4. Customers arrive at a fast-food restaurant independently and at random. During lunch hour,
when more customers are expected, they arrive at the restaurant
at the rate of two per minute on average. X is the number of people who arrive between 12:15
p.m. and 12:30 p.m.
5. A set of 25 multiple-choice questions was asked in an examination. It has been determined,
according to experience, that the proportion of the questions which are guessed and answered
correctly is 35%. X is the number of questions guessed and answered correctly by a particular
student who sat the examination.
Exercise 1.19: [wk02Q2, Solution, Schedule] For each of the following moment generating functions
of discrete random variables \(X\), identify the distribution and specify the associated parameters.
1. \(M_X(t) = \dfrac{e^t}{2 - e^t}\)
2. \(M_X(t) = \left( \dfrac{e^t + 1}{2} \right)^3\)
3. \(M_X(t) = \exp\left( \dfrac{1}{2} e^t - \dfrac{1}{2} \right)\)
4. \(M_X(t) = \left( \dfrac{e^t}{2 - e^t} \right)^4\)
5. \(M_X(t) = \left( \dfrac{3e^t + 1}{4} \right)^5\)
Exercise 1.20: [wk02Q3, Solution, Schedule] Poisson approximation to the binomial. This exercise
is to show that binomial probabilities can be approximated using the Poisson probabilities, which
are generally easier to calculate. Let \(X \sim \text{Binomial}(n, p)\) and \(Y \sim \text{Poisson}(\lambda)\) where \(\lambda = np\). The
approximation states that
\[ \Pr(X = x) \approx \Pr(Y = x), \]
for large \(n\) and small \(np\). This can be proven using convergence of mgfs. Denote the respective mgfs
by \(M_X(t)\) and \(M_Y(t)\).
1. Prove that \(\lim_{n \to \infty} M_X(t) = M_Y(t)\).
Hint: use \(\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = \exp(x) = \lim_{h \to 0} (1 + hx)^{1/h}\).
2. Another method to prove this approximation is as follows: First, establish that the Poisson
distribution satisfies the relation
\[ \frac{\Pr(Y = x)}{\Pr(Y = x - 1)} = \frac{\lambda}{x}, \quad \text{for } x = 1, 2, \ldots. \]
Second, a similar relation can be approximated for the binomial distribution:
\[ \lim_{p \to 0} \frac{\Pr(X = x)}{\Pr(X = x - 1)} = \frac{np}{x}. \]
Hint: first show that
\[ \frac{\Pr(X = x)}{\Pr(X = x - 1)} = \frac{n - x + 1}{x} \cdot \frac{p}{1 - p}, \]
then take \(\lim_{p \to 0}\) and use that \(\lim_{p \to 0} (np) = \lambda\). Then show that:
\[ \Pr(Y = 0) \approx \Pr(X = 0), \]
for large \(n\).
3. A typesetter, on average, makes one error in every 400 words typeset. A typical page contains
300 words. Use the Poisson approximation to the binomial to compute the probability that there
will be more than 3 errors in 10 pages.
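The quality of the approximation in part 3 can be checked numerically. The following is a minimal sketch, assuming each of the 10 × 300 = 3000 words is an independent trial with error probability 1/400 (so that the Poisson mean is np = 7.5); the function names are illustrative only.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Pr(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Pr(Y = x) for Y ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)


# Assumed reading of part 3: n = 3000 words, p = 1/400 per word.
n, p = 3000, 1 / 400
lam = n * p  # Poisson mean, lambda = np = 7.5

# Pr(more than 3 errors) = 1 - Pr(0, 1, 2 or 3 errors)
binomial_tail = 1 - sum(binomial_pmf(x, n, p) for x in range(4))
poisson_tail = 1 - sum(poisson_pmf(x, lam) for x in range(4))

print(f"exact binomial tail : {binomial_tail:.4f}")
print(f"Poisson approx. tail: {poisson_tail:.4f}")
```

With these inputs the two tail probabilities should agree to about three decimal places, which is the sense in which the Poisson probabilities approximate the binomial ones.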
Exercise 1.21: [wk02Q4, Solution, Schedule] An insurance company receives 200 claims per day on
the average. Claims arrive independently and at random at the company office. Of the claims, 95%
are for amounts less than $100 and are processed immediately; the remaining 5% are examined more
closely to verify their accuracy and eligibility.
1. Determine the probability of getting no claims over $100 in a given day.
2. Determine the probability of getting at most two claims over $100 in a given day.
3. How many claims for amounts less than $100 should this company expect to receive in a week
(5 business days)?
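Exercise 1.22: [wk02Q5, Solution, Schedule] Let \(X\) have a Gamma(\(\alpha\), \(\beta\)) distribution.
1. Prove that the mgf of \(X\) can be expressed as:
\[ M_X(t) = \left( \frac{\beta}{\beta - t} \right)^{\alpha}, \quad \text{for } t < \beta. \]
2. Establish also that for any positive constant \(r\)
\[ E[X^r] = \frac{\Gamma(r + \alpha)}{\beta^r\,\Gamma(\alpha)}. \]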
Exercise 1.23: [wk02Q6, Solution, Schedule] Suppose that you have $1 000 to invest for a year. You
are currently evaluating two investments: Investment A and Investment B, with annual rates of return
respectively denoted by \(R_A\) and \(R_B\). Assume:
\[ R_A \sim \text{Normal}(0.05, 0.1) \quad \text{and} \quad R_B \sim \text{Normal}(0.10, 0.5). \]
1. Under Investment A, compute the probability that your investment will be below $1 000 in a
year.
2. Under Investment A, compute the probability that your investment will exceed $1 200 in a year.
3. Under Investment B, compute the probability that your investment will be below $1 000 in a year.
4. Under Investment B, compute the probability that your investment will exceed $1 200 in a year.
Exercise 1.24: [wk02Q7, Solution, Schedule] A city engineer has studied the frequency of accidents
at two busy intersections. He has determined that the time T in months between accidents at each
intersection has an exponential distribution. The parameters for these two distributions are 2 and 2.5.
Assume that the occurrence of accidents at these intersections is independent.
1. Determine the probability that there are no accidents at either intersection in the next month.
2. Determine the probability that there will be no accidents for at least one of the intersections in
the next month.
Exercise 1.25: [wk02Q8, Solution, Schedule] The Pareto distribution is very commonly used to
model certain insurance loss amounts. We say X has a Pareto distribution if its density can be ex-
pressed as:
+1
fX (x) = for x > ,
x
and zero otherwise.
1. Find expressions for the mean and variance of X.
2. Find an expression for the quantile of \(X\). The quantile function is \(f(u) = F_X^{-1}(u)\); hence, one should
solve \(u = F_X\left(F_X^{-1}(u)\right)\).
3. What is then its median (i.e., u = 0.5)?
4. An insurance policy has a deductible[1] of $5. The random variable for the loss amount (before
deductible) on claims filed has a Pareto distribution with = 3.5 and = 4. Find:
(a) the mean loss amount;

(b) the expected value of the amount of a single claim; and
(c) the variance of the amount of a single claim.
1.3 Joint and Multivariate Distributions

Exercise 1.26: [wk03Q4, Solution, Schedule] Let X and Y be two discrete random variables whose
joint probability function is given by:
Pr(X = x, Y = y) |  x = 0   x = 1   x = 2   x = 3
y = 1            |  0.05    0.20    0.15    0.05
y = 2            |  0.20    0.15    0.12    0.08
Calculate:
[1] A deductible means that the policy makes no payment if the loss amount is smaller than the deductible; the claim
amount equals the loss amount minus the deductible if the loss amount is larger than the deductible.
1. E [X]
2. E [Y]
3. E [X |Y = 1]
4. Var (Y |X = 3)
5. E [XY] and Cov(X,Y).
Exercise 1.27: [wk03Q5, Solution, Schedule] Let X and Y be two discrete random variables whose
joint probability mass function is given by:
Pr(X = x, Y = y) |  x = 1   x = 2   x = 3   x = 4
y = 1            |  0.10    0.05    0.02    0.02
y = 2            |  0.05    0.20    0.05    0.02
y = 3            |  0.02    0.05    0.20    0.04
y = 4            |  0.02    0.02    0.04    0.10
1. Find the marginal probability mass functions of X and Y.
2. Find the conditional probability mass of X given Y = 2 and of Y given X = 2.
3. Find E[XY] and Cov(X, Y).
Exercise 1.28: [wk03Q6, Solution, Schedule] Let \(X\) and \(Y\) have the joint density:
\[ f_{X,Y}(x, y) = \frac{6}{7}(x + y)^2, \quad \text{for } 0 \leq x \leq 1 \text{ and } 0 \leq y \leq 1, \]
and zero otherwise.
1. By integrating over the appropriate regions, find:

(a) Pr (X < Y)
(b) Pr (X + Y 1)

(c) Pr X 21
2. Find the marginal densities of X and Y.
3. Find the two conditional densities.
Exercise 1.29: [wk03Q8, Solution, Schedule] Let \(\bar{x}_n\) and \(s_n^2\) denote the sample mean and variance
for the sample \(x_1, x_2, \ldots, x_n\). Let \(\bar{x}_{n+1}\) and \(s_{n+1}^2\) denote these quantities when an additional observation
\(x_{n+1}\) is added to the sample.
1. Show how \(\bar{x}_{n+1}\) can be computed from \(\bar{x}_n\) and \(x_{n+1}\).
2. Show that:
\[ s_{n+1}^2 = \frac{n-1}{n}\, s_n^2 + \frac{1}{n+1} \left( x_{n+1} - \bar{x}_n \right)^2 \]
so that \(s_{n+1}^2\) can be computed from \(\bar{x}_n\), \(x_{n+1}\), and \(s_n^2\).
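The recursion in part 2 can be sanity-checked numerically before proving it. The sketch below is one possible check, assuming the mean update is written as the old mean plus (new observation minus old mean) divided by (n + 1), which is one form of the answer to part 1; the data values and the helper name `update` are made up for illustration.

```python
from statistics import mean, variance


def update(mean_n, var_n, n, x_new):
    """Return the sample mean and variance of n + 1 points from those of n points."""
    new_mean = mean_n + (x_new - mean_n) / (n + 1)
    new_var = (n - 1) / n * var_n + (x_new - mean_n) ** 2 / (n + 1)
    return new_mean, new_var


# Hypothetical data, used only to exercise the recursion.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

# Start from a single observation; its variance is irrelevant because
# the factor (n - 1) / n is zero when n = 1.
running_mean, running_var = data[0], 0.0
for n, x_new in enumerate(data[1:], start=1):
    running_mean, running_var = update(running_mean, running_var, n, x_new)

print(running_mean, mean(data))
print(running_var, variance(data))
```

The running values should match the batch results from the standard library up to floating-point rounding.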
Exercise 1.30: [wk03Q9, Solution, Schedule] Suppose \(X\) and \(Y\) are two continuous random variables.
Prove that:
\[ E[Y] = \int_{-\infty}^{\infty} E[Y \mid X = x]\, f_X(x)\,dx. \]
Exercise 1.31: [wk03Q10, Solution, Schedule] You are given:
\(X_1 \sim \text{Uniform}[0, 1]\)
Conditional on \(X_1\), \(X_2 \sim \text{Uniform}[0, X_1]\)
1. Find the joint distribution function of \(X_1\) and \(X_2\).
2. Find the marginal distribution function of \(X_2\).
Exercise 1.32: [wk03Q11, Solution, Schedule] Suppose that the joint distribution function of \(X_1\) and
\(X_2\) is given by
\[ F_{X_1,X_2}(x_1, x_2) = \begin{cases} 0, & \text{if } x_1 < 0 \text{ or } x_2 < 0; \\ x_1 x_2 \left[ 1 + \frac{1}{2}(1 - x_1)(1 - x_2) \right], & \text{if } 0 \leq x_1 \leq 1 \text{ and } 0 \leq x_2 \leq 1; \\ F_{X_1}(x_1), & \text{if } x_2 > 1; \\ F_{X_2}(x_2), & \text{if } x_1 > 1. \end{cases} \]
1. Find the joint density.
2. Find the marginal distribution functions of \(X_1\) and \(X_2\). Can you recognise these distributions?
3. Find the correlation coefficient of \(X_1\) and \(X_2\).
Exercise 1.33: [wk03Q12, Solution, Schedule] We have the joint probability density function:
\[ f_{X_1,X_2}(x_1, x_2) = \begin{cases} k(1 - x_2), & \text{if } 0 \leq x_1 \leq x_2 \leq 1; \\ 0, & \text{else.} \end{cases} \]
1. Determine the value k for which this function is a density.
2. Determine the region for the integral for determining Pr(X1 3/4, X2 1/2).
3. Calculate Pr(X1 3/4, X2 1/2).
1.4 Sampling and Summarising Data

Exercise 1.34: [wk03Q1, Solution, Schedule] The claim amounts (in dollars, to the nearest $10) for
a sample of 24 recent claims for storm damage to private homes in a particular town are as follows:[2]
2 710    670  2 380  4 670  1 220  6 780  1 590  3 110
  960  8 230  3 320  3 380  2 490  1 940  3 710  4 630
4 270  4 210  1 880  3 880  1 490  5 400  2 430    850
[2] Modified Institute of Actuaries exam question.
1. Construct a stem-and-leaf display of these claim amounts.
2. Find the mean and median of the claim amounts. What can you say about the skewness of the
distribution?
3. Find the interquartile range of the claim amounts.
4. Evaluate \(F_{24}(1\,000)\), where \(F_{24}(\cdot)\) denotes the ecdf.
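For readers who want to check their hand calculations, the sample mean, the median and the empirical cdf can be computed directly. This is a minimal sketch using only the Python standard library; the helper `ecdf` is an illustrative name, and the interquartile range is left out on purpose because its value depends on the quartile convention used in the course.

```python
from statistics import mean, median

# The 24 claim amounts from the exercise, in dollars.
claims = [
    2710, 670, 2380, 4670, 1220, 6780, 1590, 3110,
    960, 8230, 3320, 3380, 2490, 1940, 3710, 4630,
    4270, 4210, 1880, 3880, 1490, 5400, 2430, 850,
]


def ecdf(sample, x):
    """Empirical cdf: the proportion of observations less than or equal to x."""
    return sum(value <= x for value in sample) / len(sample)


sample_mean = mean(claims)
sample_median = median(claims)
ecdf_at_1000 = ecdf(claims, 1000)

print(sample_mean, sample_median, ecdf_at_1000)
```

A mean above the median is the usual informal indication of positive skewness.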
Exercise 1.35: [wk03Q2, Solution, Schedule] Data were collected on 100 consecutive days for the
number of claims, X, arising from a group of insurance policies. This resulted in the following
frequency distribution:
observed claims from policy (x):    0    1    2    3    4   ≥5
frequency:                         14   25   26   18   12    5
Calculate the following sample statistics for these data:
1. mode
2. median
3. interquartile range
4. Suppose the average value for 5 claims or more is 7.5. Calculate the sample mean.
Exercise 1.36: [wk03Q3, Solution, Schedule] For a set of 32 observations, you are given:
\[ \sum_{k=1}^{32} x_k = 13\,337.6 \quad \text{and} \quad \sum_{k=1}^{32} x_k^2 = 5\,667\,388.7. \]
The largest of the observations is 605. Suppose you are interested in measuring the impact of the
largest observation on the mean and standard deviation.
1. Calculate the sample mean and the sample standard deviation.
2. Calculate the sample mean and the sample standard deviation, with the largest observation
deleted.
3. What is the percentage change in the mean?
4. What is the percentage change in the standard deviation?
Exercise 1.37: [wk03Q7, Solution, Schedule] Two independent measurements, \(X\) and \(Y\), are taken
of a quantity \(\mu\). We are given the means are equal, \(E[X] = E[Y] = \mu\), but the variances \(\sigma_X^2\) and \(\sigma_Y^2\)
are not equal. The two measurements are then combined by means of a weighted average to give:
\[ Z = \alpha X + (1 - \alpha) Y, \]
where \(\alpha\) is a constant between 0 and 1, i.e., \(0 \leq \alpha \leq 1\).
1. Show that \(E[Z] = \mu\).
2. Find \(\alpha\) in terms of \(\sigma_X\) and \(\sigma_Y\) to minimise \(\mathrm{Var}(Z)\).
3. Under what circumstances is it better to use the average \((X + Y)/2\) than either \(X\) or \(Y\) alone to
determine \(\mu\)? Hint: a smaller variance would give a better estimate of the population mean.
4. Now, suppose \(X\) and \(Y\) are instead not independent and have covariance:
\[ \mathrm{Cov}(X, Y) = \sigma_{XY}. \]
Find \(\alpha\) in terms of \(\sigma_X\), \(\sigma_Y\) and \(\sigma_{XY}\) to minimise \(\mathrm{Var}(Z)\).
1.5 Functions of Random Variables

Exercise 1.38: [wk04Q1, Solution, Schedule] Compound Distribution. In a portfolio of insurance
policies, the amount of claim is a random variable Xk which has an exponential distribution with
1
mean , for k = 1, 2, . . . The number of claims N in a single period is also a random variable but with

a Poisson() . The total claims then in the portfolio during the period is given by:
S = X1 + X2 + . . . + X N .
1. Find the mean of S , E[S ].
2. Find the variance of S , Var(S ).
3. Find the moment generating function of S , MS (t).
Exercise 1.39: [wk04Q2, Solution, Schedule] Let \(X_1\), \(X_2\) and \(X_3\) be i.i.d. with common density:
\[ f_X(x) = e^{-x}, \quad x \geq 0. \]
1. Find the joint density of \(X_{(1)}\) and \(X_{(3)}\).
Hint: First find the joint density of \(X_{(1)}\), \(X_{(2)}\), and \(X_{(3)}\), i.e., \(f_{X_{(1)},X_{(2)},X_{(3)}}(y_1, y_2, y_3) = \ldots\)
Second, find the distribution of only \(X_{(1)}\) and \(X_{(3)}\) by integrating over the other random
variable (similar to finding the marginal distribution). Be careful with the limits for \(X_{(2)}\): what are
the lowest and highest values it can take?
2. Compute \(E[X_{(1)}]\) and \(E[X_{(3)}]\).
3. Compute \(\mathrm{Var}(X_{(1)})\) and \(\mathrm{Var}(X_{(3)})\).
4. Compute \(E[X_{(1)} X_{(3)}]\) and the correlation coefficient \(\rho(X_{(1)}, X_{(3)})\).

Exercise 1.40: [wk04Q3, Solution, Schedule] Let \(X \sim \text{Gamma}(\alpha, 1)\) and \(Y \sim \text{Gamma}(\beta, 1)\) be independent
random variables. Define \(U = X + Y\) and \(V = X/(X + Y)\).
1. Use the moment generating function technique to find the distribution of \(U\).
2. Use the Jacobian transformation technique to find the joint distribution of \(U\) and \(V\).
3. Show that \(U\) and \(V\) are independent.
Hint: You do not need to do any additional calculations to show this.
4. Find the marginals of \(U\) and \(V\) using their joint distribution derived in part 2. Demonstrate that
the marginal of \(U\) is consistent with that derived from Exercise 1.40 part 1.
5. Use Exercise 1.40 parts 3 and 4 to find the mean and variance of \(V\).
Exercise 1.41: [wk04Q4, Solution, Schedule] Let \(X_1\), \(X_2\) and \(X_3\) be three independent and identically
distributed Exp(1) random variables. Find:
1. \(E\left[ X_{(3)} \mid X_{(1)} = x \right]\)
2. \(E\left[ X_{(1)} \mid X_{(3)} = x \right]\)
3. \(f_{X_{(1)},X_{(3)}}(x, y)\)
4. \(f_R(r)\), where \(R = X_{(3)} - X_{(1)}\) is the range.
Exercise 1.42: [wk04Q5, Solution, Schedule] Let \(X_1\) and \(X_2\) be i.i.d. (independent and identically
distributed) \(N(0, 1)\) random variables.
1. Show that \(X_1 + X_2\) has a normal distribution and specify its parameters.
2. Show that \(X_1 - X_2\) has the same distribution as \(X_1 + X_2\).
3. Suppose \(X_1\) and \(X_2\) are no longer independent but each still has a \(N(0, 1)\) distribution. Will \(X_1 + X_2\)
and \(X_1 - X_2\) still be independent?
4. Let \(X\) be Gamma(\(\alpha\), \(\beta\)) distributed.
(a) Find the p.d.f. of an Inverse Gamma Distribution, i.e., find the p.d.f. of \(Y = 1/X\).
(b) Find the c.d.f. of the inverse gamma distribution as a function of the c.d.f. of the gamma
distribution.
Exercise 1.43: [wk05Q14, Solution, Schedule] Using the p.d.f. of a chi-squared distribution with
one degree of freedom:
\[ f_Y(y) = \frac{\exp(-y/2)}{\sqrt{2\pi y}}, \quad \text{if } y > 0, \]
and zero otherwise, prove that the moment generating function of \(Y\) is given by:
\[ M_Y(t) = (1 - 2t)^{-1/2}. \]
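Exercise 1.44: [wk05Q15, Solution, Schedule] Prove that:
\[ t_{n-1} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty, \]
where you might use that:
\[ \lim_{n \to \infty} \frac{\Gamma\left( \frac{n+1}{2} \right)}{\Gamma\left( \frac{n}{2} \right)} = \sqrt{\frac{n}{2}}. \]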
Exercise 1.45: [wk05Q16, Solution, Schedule] Prove that the p.d.f. of a Snedecor's F distribution,
given by the transformation:
\[ F = \frac{U/n_1}{V/n_2}, \]
where \(U \sim \chi^2(n_1)\) and \(V \sim \chi^2(n_2)\), is given by:
\[ f_F(f) = n_1^{n_1/2}\, n_2^{n_2/2}\, \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \cdot \frac{f^{\,n_1/2 - 1}}{(n_2 + f n_1)^{(n_1 + n_2)/2}}. \]
Exercise 1.46: [wk04Q6, Solution, Schedule]
I Let \(Z_1\) and \(Z_2\) be two independent \(N(0, 1)\) random variables and let \(V_1 \sim \chi^2(r_1)\) and \(V_2 \sim \chi^2(r_2)\)
be two independent chi-squared random variables. Which of the following random variables
has a t-distribution with \((r_1 + r_2)\) degrees of freedom?
Z1 + Z2
(A)
(V1 + V2 ) /(r1 + r2 )
Z1 + Z2
(B)
(V1 /r1 ) + (V2 /r2 )
Z1 + Z2
(C)
2 (V1 + V2 ) /(r1 + r2 )
Z1 Z2
(D)
(V1 + V2 ) /(r1 + r2 )
Z1 Z2
(E) +
V1 /r1 V2 /r2
II Let Z1 and Z2 be two independent standard normal random variables. Which of the following
combinations of the two has also a standard normal random variable?
(A) (Z1 + Z2 ) /2
(B) Z1 + Z2
(C) Z1 /Z2
(D) Z1 Z2

(E) (Z1 Z2 ) / 2
III Let \(Z_1 \sim N(0, 1)\) and \(Z_2 \sim N(0, 1)\) be two random variables with correlation coefficient
\[ \rho(Z_1, Z_2) = \rho, \]
where \(-1 \leq \rho \leq 1\). Let \(V\) be a \(\chi^2(r)\) random variable independent of \(Z_1\) and \(Z_2\).
Which of the following has a t-distribution with \(r\) degrees of freedom?
i. \(\sqrt{r}\, Z_1 V^{-1/2}\)
ii. \(\sqrt{r}\, Z_2 V^{-1/2}\)
iii. \(\sqrt{\dfrac{r}{2}}\, (Z_1 + Z_2)\, V^{-1/2}\)
iv. \(\sqrt{\dfrac{r}{2(\rho + 1)}}\, (Z_1 + Z_2)\, V^{-1/2}\)
(A) All but i

(B) All but ii
(C) All but iii
(D) All but iv
(E) All
IV Let X1 , X2 , . . . , Xn be i.i.d. (independent and identically distributed) Exp() random variables

1
(m.g.f.: MXi (t) = 1 t ). Which of the following describes the distribution of the sample
mean:
n
1X
X= Xk ?
n k=1
(A) X Exp()
(B) X Exp(n)
(C) X Exp(/n)
(D) X Gamma(n, n)
X Gamma n, n

(E)
n
Note: m.g.f. of Gamma: MXi (t) = 1 t ).
V Let X1 , . . . , Xn be n independent and identically distributed Poisson random variables with mean
. Describe the distribution of the sum of these random variables:
Xn
S = Xk .
k=1
(A) S Poisson(1)
(B) S Poisson()
(C) S Poisson(/n)
(D) S Poisson(n)
(E) Cannot be determined from the given information
VI Suppose \(X_1, X_2, \ldots, X_{20}\) are twenty independent random variables and are identically distributed
as Exp(2). Determine \(\Pr(X_{(20)} > 1)\).
(A) \(\Pr(X_{(20)} > 1) = 0.94\)
(B) \(\Pr(X_{(20)} > 1) = 0.95\)
(C) \(\Pr(X_{(20)} > 1) = 0.96\)
(D) \(\Pr(X_{(20)} > 1) = 0.97\)
(E) \(\Pr(X_{(20)} > 1) = 0.98\)
VII Let \(X_1, X_2, \ldots, X_n\) be \(n\) i.i.d. (independent and identically distributed) random variables each
with density:
\[ f_X(x) = 2x, \quad \text{for } 0 < x < 1, \]
and zero otherwise.
Determine \(E[X_{(n)}]\).
(A) \(E[X_{(n)}] = n/(n + 2)\)
(B) \(E[X_{(n)}] = n/(n + 1)\)
(C) \(E[X_{(n)}] = 1\)
(D) \(E[X_{(n)}] = 2n/(2n + 1)\)
(E) \(E[X_{(n)}] = 2n/(n + 1)\)
VIII In a 100-meter Olympic race, the running times are considered to be uniformly distributed
between 8.5 and 10.5 seconds. Suppose there are 8 competitors in the finals. The current world
record is 9.9 seconds.
Determine the probability that the loser of the race will not break the world record.
(A) 0.54
(B) 0.64
(C) 0.74
(D) 0.84
(E) 0.94
Solutions
Solution 1.1: [wk01Q1, Exercise, Schedule] Urn 1: 1 Black (B), 1 Gold (G) and Urn 2: 1 White
(W), 1 Gold (G). Define B = black ball, G = gold ball, W = white ball.
1. \(\Omega = \{BW, BG, GW, GG\}\)
2. \(\mathcal{F} = \{ \emptyset, \{BW\}, \{BG\}, \{GW\}, \{GG\}, \{BW, BG\}, \{BW, GW\}, \{BW, GG\},\)
\(\{BG, GW\}, \{BG, GG\}, \{GW, GG\}, \{BW, BG, GW\}, \{BW, BG, GG\},\)
\(\{BW, GW, GG\}, \{BG, GW, GG\}, \{BW, BG, GW, GG\} \}\)
3. Let E be the event of getting the same color for both balls. Then \(E = \{GG\}\) and
\[ \Pr(E) = \Pr(\text{Urn 1} = G \cap \text{Urn 2} = G) \overset{*}{=} \Pr(\text{Urn 1} = G)\,\Pr(\text{Urn 2} = G) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}, \]
* using independence.
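Solution 1.2: [wk01Q2, Exercise, Schedule]
1. Define R = red ball, G = gold ball, S = silver ball.
Draw without replacement: use combinations. The number of possible combinations RRR (n = 49, r = 3) is \(\binom{49}{3}\)
and the total number of combinations (n = 100, r = 3) is \(\binom{100}{3}\), so
\[ \Pr(RRR) = \binom{49}{3} \Big/ \binom{100}{3} = \frac{49}{100} \cdot \frac{48}{99} \cdot \frac{47}{98} = \frac{94}{825} = 0.1139. \]
2. \(\Pr(RGS) = \frac{49}{100} \cdot \frac{34}{99} \cdot \frac{17}{98} = \frac{289}{9900} = 0.0292\), using the multiplication rule.
3. Let A = 3rd ball is silver and B = first 2 are red and gold. Thus,
\[ \Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{\Pr(RGS \cup GRS)}{\Pr(RG\cdot \cup GR\cdot)} \]
where
\[ \Pr(RGS \cup GRS) = 2! \left( \frac{49}{100} \cdot \frac{34}{99} \cdot \frac{17}{98} \right) \]
and
\[ \Pr(RG\cdot \cup GR\cdot) = 2! \left( \frac{49}{100} \cdot \frac{34}{99} \cdot \frac{98}{98} \right). \]
Therefore, we have the required probability:
\[ \Pr(A \mid B) = \frac{17}{98} = 0.1735. \]
4. Let C = first 2 are red. Thus,
\[ \Pr(C \mid A) = \frac{\Pr(A \cap C)}{\Pr(A)} = \frac{\Pr(RRS)}{\Pr(A)} \]
where
\[ \Pr(RRS) = \frac{49}{100} \cdot \frac{48}{99} \cdot \frac{17}{98} \]
and note that for the event A, you either have 0, 1, or 2 silver in the first two so that
\[ \Pr(A) = \left( \frac{83}{100} \cdot \frac{82}{99} \cdot \frac{17}{98} \right) + 2! \left( \frac{83}{100} \cdot \frac{17}{99} \cdot \frac{16}{98} \right) + \left( \frac{17}{100} \cdot \frac{16}{99} \cdot \frac{15}{98} \right). \]
After simplifying, the required probability is:
\[ \Pr(C \mid A) = \frac{49 \cdot 48}{(83 \cdot 82) + 2\,(83 \cdot 16) + (16 \cdot 15)} = 0.2424. \]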
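The four answers above can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming the urn composition implied by the fractions in the solution (49 red, 34 gold and 17 silver balls out of 100); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Ball counts inferred from the fractions used in the solution (an assumption).
RED, GOLD, SILVER, TOTAL = 49, 34, 17, 100
DENOMINATOR = 100 * 99 * 98  # ordered draws of three balls without replacement

# Part 1: three reds, counted with combinations.
p_rrr = Fraction(comb(RED, 3), comb(TOTAL, 3))

# Part 2: red, then gold, then silver (multiplication rule).
p_rgs = Fraction(RED * GOLD * SILVER, DENOMINATOR)

# Part 3: third ball silver, given one red and one gold already drawn.
p_a_given_b = Fraction(SILVER, TOTAL - 2)

# Part 4: Pr(first two red | third silver) = Pr(RRS) / Pr(third silver).
p_rrs = Fraction(RED * (RED - 1) * SILVER, DENOMINATOR)
non_silver = TOTAL - SILVER
p_a = Fraction(
    non_silver * (non_silver - 1) * SILVER      # no silver in the first two
    + 2 * non_silver * SILVER * (SILVER - 1)    # one silver in the first two
    + SILVER * (SILVER - 1) * (SILVER - 2),     # two silver in the first two
    DENOMINATOR,
)
p_c_given_a = p_rrs / p_a

for label, value in [("RRR", p_rrr), ("RGS", p_rgs),
                     ("A|B", p_a_given_b), ("C|A", p_c_given_a)]:
    print(label, value, round(float(value), 4))
```

As a cross-check, the unconditional probability that the third ball is silver reduces to 17/100, as symmetry suggests.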
Solution 1.3: [wk01Q3, Exercise, Schedule] Let A and B be two independent events so that \(\Pr(A \cap B) = \Pr(A)\Pr(B)\),
\(\Pr(B \mid A) = \Pr(B)\), and \(\Pr(A \mid B) = \Pr(A)\).
1. To show \(A\) and \(B^C\) are independent, we have:
\[ \Pr(A \cap B^C) = \Pr(B^C \mid A)\,\Pr(A) = \big( 1 - \underbrace{\Pr(B \mid A)}_{\Pr(B)} \big)\,\Pr(A) = \Pr(B^C)\,\Pr(A), \]
thus independent.
2. To show \(A^C\) and \(B\) are independent, we have:
\[ \Pr(A^C \cap B) = \Pr(A^C \mid B)\,\Pr(B) = \big( 1 - \underbrace{\Pr(A \mid B)}_{\Pr(A)} \big)\,\Pr(B) = \Pr(A^C)\,\Pr(B), \]
thus independent.
3. It then becomes straightforward to show \(A^C\) and \(B^C\) are independent. Given that A and B are
independent, we know from part 2 that \(A^C\) and B are also independent. Applying part 1 to the pair \(A^C\) and B, \(A^C\)
and \(B^C\) must also be independent.
Solution 1.4: [wk01Q4, Exercise, Schedule]
1. Let A and B be mutually exclusive, i.e., \(A \cap B = \emptyset\) and \(\Pr(A \cap B) = 0\). Suppose they are also
independent. Then
\[ \Pr(A \cap B) = \Pr(A)\,\Pr(B) = 0. \]
Therefore, either \(\Pr(A) = 0\) or \(\Pr(B) = 0\). But both \(\Pr(A) > 0\) and \(\Pr(B) > 0\) by assumption.
This is a contradiction, so A and B cannot be independent.
2. Now suppose A and B are independent, i.e., \(\Pr(A \cap B) = \Pr(A)\,\Pr(B)\). Suppose they are
mutually exclusive. Then \(\Pr(A \cap B) = 0\), which implies \(\Pr(A)\,\Pr(B) = 0\) and, following
a similar argument to the one above, this cannot be true. Therefore they cannot be mutually exclusive.
Solution 1.5: [wk01Q5, Exercise, Schedule] There are a total of \(6 \times 6 = 36\) possible outcomes.
We have that: E1 = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)},
E2 = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), (3, 6), (4, 5), (5, 4), (6, 3), (4, 6), (5, 5),
and
E3 = {(1, 1), (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}. Thus, by counting
from these possible outcomes, we see that:
\[ \Pr(E_1) = \frac{6}{36} = \frac{1}{6}, \qquad \Pr(E_2) = \frac{18}{36} = \frac{1}{2}, \qquad \Pr(E_3) = \frac{12}{36} = \frac{1}{3}. \]
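1. Note that
\[ \Pr(E_1 \cap E_2 \cap E_3) = \Pr(\text{double and sum is 8}) = \Pr((4, 4)) = \frac{1}{36} \]
and
\[ \Pr(E_1)\,\Pr(E_2)\,\Pr(E_3) = \frac{1}{6} \cdot \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{36}. \]
Thus, the product rule for independence holds for the triple intersection.
2. However,
\[ \Pr(E_1 \cap E_2) = \Pr(\text{double and sum is 8 or 10}) = \frac{1}{18}, \]
which is not equal to:
\[ \Pr(E_1)\,\Pr(E_2) = \frac{1}{6} \cdot \frac{1}{2} = \frac{1}{12}. \]
3. Note that:
\[ \Pr(E_2 \cap E_3) = \Pr(\text{sum is 7 or 8}) = \frac{11}{36}, \]
which is not equal to:
\[ \Pr(E_2)\,\Pr(E_3) = \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}. \]
4. Consider:
\[ \Pr(E_1 \cap E_3) = \Pr(\text{doubles and sum is 2 or 8}) = \frac{2}{36} = \frac{1}{18} \]
and note that
\[ \Pr(E_1)\,\Pr(E_3) = \frac{1}{6} \cdot \frac{1}{3} = \frac{1}{18}. \]
Therefore, \(E_1\) and \(E_3\) are independent.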
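The counting in this solution can be confirmed by enumerating all 36 outcomes. The sketch below assumes event definitions read off the outcome lists above (E1: a double; E2: the sum is between 7 and 10; E3: the sum is 2, 7 or 8); these definitions are an inference from the listed outcomes, not a quotation of the exercise.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event given as a predicate on an outcome (d1, d2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


# Event definitions inferred from the outcome lists in the solution.
def e1(o):
    return o[0] == o[1]          # a double


def e2(o):
    return 7 <= sum(o) <= 10     # sum is 7, 8, 9 or 10


def e3(o):
    return sum(o) in (2, 7, 8)   # sum is 2, 7 or 8


triple = pr(lambda o: e1(o) and e2(o) and e3(o))
pair_12 = pr(lambda o: e1(o) and e2(o))
pair_23 = pr(lambda o: e2(o) and e3(o))
pair_13 = pr(lambda o: e1(o) and e3(o))

print("triple:", triple, "product:", pr(e1) * pr(e2) * pr(e3))
print("E1,E2 :", pair_12, "product:", pr(e1) * pr(e2))
print("E2,E3 :", pair_23, "product:", pr(e2) * pr(e3))
print("E1,E3 :", pair_13, "product:", pr(e1) * pr(e3))
```

The output shows that the product rule holds for the triple and for the pair E1, E3, but fails for the pairs E1, E2 and E2, E3.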
Solution 1.6: [wk01Q6, Exercise, Schedule] Let A = student A is pardoned, B = student B is
pardoned, C = student C is pardoned, and Z = Lecturer says B is not pardoned.
\[ \Pr(Z) \overset{*}{=} \Pr(Z \cap A) + \Pr(Z \cap B) + \Pr(Z \cap C) \overset{**}{=} \Pr(A)\,\Pr(Z \mid A) + \Pr(Z \cap B) + \Pr(C)\,\Pr(Z \mid C) = \frac{1}{3} \cdot \frac{1}{2} + 0 + \frac{1}{3} \cdot 1 = \frac{1}{2}. \]
* using the law of total probability, ** using the multiplication rule.
1. Thus, using the lecturer's reasoning,
\[ \Pr(A \mid Z) = \frac{\Pr(\text{lecturer says B is not pardoned and A is pardoned})}{\Pr(Z)} = \frac{1/6}{1/2} = \frac{1}{3}. \]
The lecturer's reasoning is clearly justified, as the additional information provides no change in the
probability.
2. However, A falsely interprets the event Z as equal to the event \(B^C\) (event B is not pardoned) and
calculates:
\[ \Pr(A \mid B^C) = \frac{\Pr(A \cap B^C)}{\Pr(B^C)} = \frac{1/3}{2/3} = \frac{1}{2}. \]
Student A's reasoning is not justified, i.e., the event Z has more information than the event \(B^C\).
The lecturer does provide extra information on the probability that event C will happen, i.e.,
\[ \Pr(C \mid Z) = \frac{\Pr(C \cap Z)}{\Pr(Z)} = \frac{1/3}{1/2} = \frac{2}{3}. \]
Using \(\sum_x p_X(x) = 1\), \(\Pr(C \mid Z) = 2/3\) and \(\Pr(B \mid Z) = 0\), we have that \(\Pr(A \mid Z) = 1/3\).
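The posterior probabilities can be recomputed with Bayes' rule in a few lines. This is a sketch under the assumptions used in the solution: each student is pardoned with prior probability 1/3, and the lecturer names B as not pardoned with probability 1/2 if A is pardoned, 0 if B is pardoned, and 1 if C is pardoned. The dictionary names are illustrative.

```python
from fractions import Fraction

# Prior probability that each student is the one pardoned.
prior = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# Pr(Z | student pardoned), where Z = "lecturer says B is not pardoned".
likelihood = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1)}

# Law of total probability for Pr(Z).
pr_z = sum(prior[s] * likelihood[s] for s in prior)

# Bayes' rule for each student.
posterior = {s: prior[s] * likelihood[s] / pr_z for s in prior}

print("Pr(Z) =", pr_z)
for student, value in posterior.items():
    print(f"Pr({student} | Z) = {value}")
```

The posterior for A stays at 1/3 while the posterior for C rises to 2/3, matching the argument above.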

Solution 1.7: [wk01Q7, Exercise, Schedule]
1. Use combinations. We have n = 11, r = 4: \(\binom{n}{r} = \binom{11}{4} = 330\) ways to eliminate 4 flights.
2. Use combinations. We have n = 5, r = 2: \(\binom{n}{r} = \binom{5}{2} = 10\) ways to eliminate 2 flights for the
first airline.
3. Use combinations. We have n = 6, r = 2: \(\binom{n}{r} = \binom{6}{2} = 15\) ways to eliminate 2 flights for the
second airline.
4. Use the multiplication rule. \(S_1\) = number of ways the first airline can eliminate 2 flights, with
\(n_1 = \binom{5}{2}\), and \(S_2\) = number of ways the second airline can eliminate 2 flights, with \(n_2 = \binom{6}{2}\). There
are \(n_1 \cdot n_2 = \binom{5}{2} \binom{6}{2} = 150\) ways to eliminate 2 flights for each airline.
Solution 1.8: [wk01Q8, Exercise, Schedule] Define D = 2 marbles have different colors, \(B_1\) =
Box 1 is selected, \(B_2\) = Box 2 is selected, \(B_3\) = Box 3 is selected. Let \(p\) be the probability that
box 1 is selected. Then \(p + 2p + 3p = 1\). Thus \(p = \frac{1}{6}\). The required probability is:
\[ \Pr(D) = \Pr(D \mid B_1)\,\Pr(B_1) + \Pr(D \mid B_2)\,\Pr(B_2) + \Pr(D \mid B_3)\,\Pr(B_3) = \frac{1 \cdot 4}{\binom{5}{2}} \cdot \frac{1}{6} + \frac{2 \cdot 3}{\binom{5}{2}} \cdot \frac{2}{6} + \frac{3 \cdot 2}{\binom{5}{2}} \cdot \frac{3}{6} = \frac{17}{30}. \]
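The law of total probability used here is easy to verify exactly. The sketch assumes the box contents implied by the numerators above (box 1 holds 1 and 4 marbles of the two colours, box 2 holds 2 and 3, box 3 holds 3 and 2) and selection probabilities 1/6, 2/6 and 3/6; the list names are illustrative.

```python
from fractions import Fraction
from math import comb

# (marbles of colour one, marbles of colour two) per box; inferred from the solution.
boxes = [(1, 4), (2, 3), (3, 2)]

# Selection probabilities p, 2p, 3p with p = 1/6.
weights = [Fraction(k, 6) for k in (1, 2, 3)]

# Pr(D | box) = (ways to pick one of each colour) / (ways to pick any two marbles).
pr_d_given_box = [Fraction(a * b, comb(a + b, 2)) for a, b in boxes]

# Law of total probability.
pr_d = sum(w * p for w, p in zip(weights, pr_d_given_box))
print(pr_d)
```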
Solution 1.9: [wk01Q9, Exercise, Schedule] We know: \(\Pr(i) \geq 0\) for \(i = 0, 1, 2, \ldots\) and \(\sum_i \Pr(i) = 1\).
Rewriting the p.m.f. gives: \(\Pr(0) = \Pr(1) = \Pr(2)\); \(\Pr(3) = \frac{1}{2}\Pr(2) = \frac{1}{2}\Pr(0)\); \(\Pr(4) = \frac{1}{3}\Pr(3) = \frac{1}{3!}\Pr(0)\).
Hence, we have: \(\Pr(k) = \frac{1}{k-1}\Pr(k-1) = \frac{1}{(k-1)!}\Pr(0)\).
Then we rewrite:
\[ \sum_{i=0}^{\infty} \Pr(i) = \Pr(0) + \sum_{i=1}^{\infty} \Pr(i) = \Pr(0) + \sum_{i=1}^{\infty} \frac{1}{(i-1)!}\Pr(0) = \left( 1 + \sum_{i=1}^{\infty} \frac{1}{(i-1)!} \right) \Pr(0) = 1. \]
Using the series expansion for \(e^x\) at \(x = 1\), we have \(\Pr(0) = \frac{1}{1+e}\).
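A quick numerical check confirms that this choice of Pr(0) makes the probabilities sum to one. The sketch truncates the infinite sum at 60 terms, which is an arbitrary cut-off chosen because the factorial terms are negligible long before then.

```python
from math import e, factorial

p0 = 1 / (1 + e)  # Pr(0) from the solution


def pmf(k):
    """Pr(k) = Pr(0) / (k - 1)! for k >= 1, and Pr(0) for k = 0."""
    return p0 if k == 0 else p0 / factorial(k - 1)


# Truncated sum; the tail beyond 60 terms is far below floating-point precision.
total = sum(pmf(k) for k in range(60))
print(total)
```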
Solution 1.10: [wk01Q10, Exercise, Schedule] X is a continuous random variable with density function:
\[ f_X(x) = c e^{-x}, \quad x > 1, \]
and zero otherwise.
1. To prove that \(f_X\) is a density, the following two conditions must be satisfied: 1) \(\int_{-\infty}^{\infty} f_X(x)\,dx = 1\)
and 2) \(f_X(x) \geq 0\) for all \(-\infty < x < \infty\).
\[ \int_{-\infty}^{\infty} f_X(x)\,dx = \int_1^{\infty} c e^{-x}\,dx = \left[ -c e^{-x} \right]_1^{\infty} = c e^{-1} = 1. \]
Thus for \(c = e\) the first condition holds.
For \(c = e\) we have \(f_X(x) = e^{-x+1} \geq 0\) for \(x > 1\) and zero otherwise, hence the
second condition is also satisfied. Thus \(c = e\).
2. \[ \Pr(X < 3 \mid X > 2) = \frac{\Pr(X < 3 \cap X > 2)}{\Pr(X > 2)} = \frac{\int_2^3 c e^{-x}\,dx}{\int_2^{\infty} c e^{-x}\,dx} = \frac{\int_2^3 e^{-x}\,dx}{\int_2^{\infty} e^{-x}\,dx} = \frac{e^{-2} - e^{-3}}{e^{-2}} = 1 - e^{-1} = \frac{e - 1}{e}. \]
Solution 1.11: [wk01Q11, Exercise, Schedule]
1.
if x = 1
1

32

pX (x) = if x = 2/3

03

otherwise.
2. graph of \(p_X\):
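Solution 1.12: [wk01Q12, Exercise, Schedule] We are given a normally distributed random variable.
1. \(M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}\) (shown in Module 1.2 video lecture)
2. Since \(S(t) = \log[M_X(t)]\), we have
\[ \frac{d}{dt} S(t) = \frac{1}{M_X(t)} M_X'(t) \;\Rightarrow\; \left. \frac{d}{dt} S(t) \right|_{t=0} = \frac{1}{M_X(0)} M_X'(0) \overset{*}{=} E(X) \]
* using \(M_X(0) = 1\) and \(M_X'(0) = E[X]\),
and (using the quotient rule for derivatives)
\[ \frac{d^2}{dt^2} S(t) = \frac{M_X(t)\, M_X''(t) - \left[ M_X'(t) \right]^2}{[M_X(t)]^2} \]
\[ \Rightarrow\; \left. \frac{d^2}{dt^2} S(t) \right|_{t=0} = \frac{M_X(0)\, M_X''(0) - \left[ M_X'(0) \right]^2}{[M_X(0)]^2} \overset{**}{=} E\left[ X^2 \right] - [E(X)]^2 = \mathrm{Var}(X). \]
** using \(M_X(0) = 1\), \(M_X'(0) = E[X]\), and \(M_X''(0) = E\left[ X^2 \right]\).
3. Thus, \(S(t) = \log(M_X(t)) = \log\left( \exp\left( \mu t + \frac{1}{2}\sigma^2 t^2 \right) \right) = \mu t + \frac{1}{2}\sigma^2 t^2\) implies \(S'(t) = \mu + \sigma^2 t\) and
\(S''(t) = \sigma^2\) so that
\[ S'(0) = \mu \quad \text{and} \quad S''(0) = \sigma^2 \]
and the result \(E[X] = \mu\) and \(\mathrm{Var}(X) = \sigma^2\) immediately follows.
4. This function is called the cumulant generating function of X.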
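The claim that the first two derivatives of the cumulant generating function at zero give the mean and variance can be illustrated numerically with finite differences. This is only a sketch: the parameter values MU and SIGMA are hypothetical, and the step size h is an arbitrary small number.

```python
from math import exp, log

# Hypothetical parameters of the normal distribution, chosen for illustration.
MU, SIGMA = 1.5, 2.0


def mgf(t):
    """Moment generating function of N(MU, SIGMA^2)."""
    return exp(MU * t + 0.5 * SIGMA**2 * t**2)


def cgf(t):
    """Cumulant generating function S(t) = log M_X(t)."""
    return log(mgf(t))


h = 1e-3  # finite-difference step (assumed small enough for this check)

# Central differences approximating S'(0) and S''(0).
first_derivative = (cgf(h) - cgf(-h)) / (2 * h)
second_derivative = (cgf(h) - 2 * cgf(0.0) + cgf(-h)) / h**2

print(first_derivative, "vs mean", MU)
print(second_derivative, "vs variance", SIGMA**2)
```

Because S(t) is exactly quadratic for the normal distribution, the central differences recover the mean and variance up to rounding error.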
Solution 1.13: [wk01Q13, Exercise, Schedule]
1. By the theorem explained in the lecture notes, we know that every m.g.f. corresponds to only
one distribution.
2. First, we determine the first five derivatives of MX (t) with respect to t:
MX(1) (t) = + 2t + 3t3 + 4t3
MX(2) (t) =2 + 6t + 12t2
MX(3) (t) =6t + 24t
MX(4) (t) =24
MX(5) (t) =0
Next, we can easily derive the non-central moments:
E [X] =MX(1) (0) =
h i
E X 2 =MX(2) (0) = 2
h i
E X 3 =MX(3) (0) = 6
h i
E X 4 =MX(4) (0) = 24
h i
E X 5 =MX(5) (0) = 0
3. The central moments can easily be determined using the non-central moments:
E X = = = 0

h i h i
E (X )2 =E X 2 2X + 2
h i
=E X 2 2E [X] + 2
=2 2
h i h i
E (X )3 =E X 3 3X 2 + 3X2 3
h i h i
=E X 3 3E X 2 + 3E [X] 2 3
=6 6 + 23
h i h i
E (X )4 =E X 4 4X 3 + 6X 2 2 4X3 + 4
h i h i h i
=E X 4 4E X 3 + 6E X 2 2 4E [X] 3 + 4
=24 24 + 122 34

h i h i
E (X )5 =E X 5 5X 4 + 10X 3 2 10X 2 3 + 5X3 5
h i h i h i h i
=E X 5 5E X 4 + 10E X 3 2 10E X 2 3 + 5E [X] 4 5
= 120 + 602 203 + 44
n
* using the Binomial expansion: (a + b)n = an + 1
an1 b + n2 an2 b2 + . . . + bn with a = X,
b = , and n=2,3,4, or 5.
4. Given the central moments we have:
- Mean: = E [X] = ;
h i
- Variance: 2 = E (X )2 = 2 2 ;
E[(X)3 ] 66+23
- Skewness: = 2 3/2
= ;
E[(X) (22 )3/2
]
E (X)4 ]
- (Excess) Kurtosis: = [
2 34
2 2
= 2424+12 2 3.
E[ (X) ] (2 )
2
5. First, we calculate the mean, variance, skewness, and kurtosis for those four sets of parameters.
We have that:
- for A we have: Mean=1, Variance=3, Skewness=-0.7698, and (Excess) Kurtosis=-2/3;
- for B1 we have: Mean=1, Variance=1, Skewness=-0.7698, and (Excess) Kurtosis=-2/3;
- for B2 we have: Mean=1, Variance=3, Skewness= 0.3849, and (Excess) Kurtosis=-2/3;
- for B3 we have: Mean=1, Variance=3, Skewness=-0.7698, and (Excess) Kurtosis=1;
Comparing the distributions we have:
i) Insurer B1 has a smaller variance than insurer A, with the same mean, skewness, and
kurtosis. This implies that a large claim is more likely for insurer A than for insurer B1
(i.e., more variability in the claim size for insurer A), hence the price for reinsuring this
risk is larger for insurer A than for insurer B1.
ii) Insurer A has a smaller skewness than insurer B2, with the same mean, variance, and
kurtosis. The negative skewness of insurer A indicates that the probability of a claim
larger than the mean is more than 50%. This implies that a large claim is more likely
for insurer A than for insurer B2, hence the price for reinsuring this risk is larger for insurer
A than for insurer B2.
iii) Insurer A has a smaller kurtosis than insurer B3, with the same mean, variance, and
skewness. Hence, the distribution of the claims of insurer A is flatter than that of insurer B3.
This implies that a large claim is more likely for insurer B3 than for insurer A, hence the
price for reinsuring this risk is larger for insurer B3 than for insurer A.
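Solution 1.14: [wk01Q14, Exercise, Schedule] The probability density function for a continuous
random variable X is given by:
\[ f_X(x) = \begin{cases} \dfrac{2}{x^3}, & \text{for } x \geq 1; \\ 0, & \text{otherwise.} \end{cases} \]
1. \[ F_X(x) = \Pr(X \leq x) = \int_{-\infty}^{x} f_X(z)\,dz = \int_1^x \frac{2}{z^3}\,dz = 2 \int_1^x z^{-3}\,dz = \left[ -z^{-2} \right]_1^x = -\frac{1}{x^2} - (-1) = 1 - \frac{1}{x^2} \quad \text{for } x \geq 1, \]
and zero otherwise.
2. \(\Pr(X \geq 4) = 1 - \Pr(X < 4) \overset{*}{=} 1 - \Pr(X \leq 4) = 1 - F_X(4) = \frac{1}{16}\).
* using \(\Pr(X = x) = 0\) for continuous random variables.
3. [Figure: graphs of \(f_X(x)\) and \(F_X(x)\) plotted against \(x\) for \(x\) between 1 and 5, vertical axis from 0 to 1.]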
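The closed-form cdf can be checked against a direct numerical integration of the density. The following is a minimal sketch using a midpoint rule written by hand; the function `integrate` and its step count are illustrative choices, not part of the course material.

```python
def pdf(x):
    """Density f_X(x) = 2 / x^3 for x >= 1, zero otherwise."""
    return 2 / x**3 if x >= 1 else 0.0


def cdf(x):
    """Closed-form cdf F_X(x) = 1 - 1 / x^2 for x >= 1, zero otherwise."""
    return 1 - 1 / x**2 if x >= 1 else 0.0


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


numeric_cdf_at_4 = integrate(pdf, 1, 4)
tail_probability = 1 - cdf(4)  # Pr(X >= 4)

print(numeric_cdf_at_4, cdf(4))
print(tail_probability)
```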

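Solution 1.15: [wk01Q15, Exercise, Schedule]
1. To show \(f_X(\cdot)\) is a density, we have to show that: 1) \(\int_{-\infty}^{\infty} f_X(x)\,dx = 1\) and 2) \(f_X(x) \geq 0\) for all
\(x\). For 1) we have:
\[ \int_{-\infty}^{\infty} f_X(x)\,dx = \int_{-\infty}^{0} \frac{1}{2}\lambda e^{\lambda x}\,dx + \int_0^{\infty} \frac{1}{2}\lambda e^{-\lambda x}\,dx = \left[ \frac{1}{2} e^{\lambda x} \right]_{-\infty}^{0} + \left[ -\frac{1}{2} e^{-\lambda x} \right]_0^{\infty} \]
\[ = \begin{cases} \frac{1}{2} + \frac{1}{2} = 1, & \text{if } \lambda > 0; \\ 0, & \text{if } \lambda = 0; \\ \frac{1}{2} - \infty - \left( \infty - \frac{1}{2} \right) = -\infty, & \text{if } \lambda < 0, \end{cases} \]
and for 2) we have that \(f_X(x) \geq 0\) for all \(x\) if \(\lambda \geq 0\).
Thus, combining 1) and 2) we have that \(f_X\) is a p.d.f. if \(\lambda > 0\).
2. For \(x < 0\) we have,
\[ F_X(x) = \int_{-\infty}^{x} f_X(u)\,du = \int_{-\infty}^{x} \frac{1}{2}\lambda e^{\lambda u}\,du = \left[ \frac{1}{2} e^{\lambda u} \right]_{-\infty}^{x} = \frac{1}{2} e^{\lambda x} \]
and for \(x \geq 0\),
\[ F_X(x) = \int_{-\infty}^{0} \frac{1}{2}\lambda e^{\lambda u}\,du + \int_0^x \frac{1}{2}\lambda e^{-\lambda u}\,du = \frac{1}{2} + \left[ -\frac{1}{2} e^{-\lambda u} \right]_0^x = 1 - \frac{1}{2} e^{-\lambda x}. \]
Thus,
\[ F_X(x) = \begin{cases} 1 - \frac{1}{2} e^{-\lambda x}, & \text{if } x \geq 0; \\ \frac{1}{2} e^{\lambda x}, & \text{if } x < 0. \end{cases} \]
3. To find the m.g.f.,
\[ M_X(t) = E\left[ e^{Xt} \right] = \int_{-\infty}^{\infty} e^{xt} f_X(x)\,dx = \int_{-\infty}^{0} e^{xt}\, \frac{1}{2}\lambda e^{\lambda x}\,dx + \int_0^{\infty} e^{xt}\, \frac{1}{2}\lambda e^{-\lambda x}\,dx \]
\[ = \frac{\lambda}{2} \int_{-\infty}^{0} e^{x(t+\lambda)}\,dx + \frac{\lambda}{2} \int_0^{\infty} e^{x(t-\lambda)}\,dx \]
\[ = \frac{\lambda}{2} \left[ \frac{1}{t+\lambda} \exp(x(t+\lambda)) \right]_{-\infty}^{0} + \frac{\lambda}{2} \left[ \frac{1}{t-\lambda} \exp(x(t-\lambda)) \right]_0^{\infty} \]
\[ = \frac{\lambda}{2} \cdot \frac{1}{\lambda + t} + \frac{\lambda}{2} \cdot \frac{1}{\lambda - t} = \frac{\lambda^2}{\lambda^2 - t^2}, \]
provided \(t + \lambda > 0\) and \(t - \lambda < 0\) (because \(e\) to the power \(\infty\) is infinity). This is equivalent to
\(-\lambda < t < \lambda\). Therefore, blindly applying the formula, the p.g.f. would be:
\[ P_X(t) = M_X(\ln t) = \frac{\lambda^2}{\lambda^2 - (\ln t)^2}, \]
but this is not a p.g.f. as this is not a discrete distribution (trick question!). Hence, the p.g.f.
does not exist for this/a continuous random variable.
4. Suppose \(\lambda = 1\), then
\[ f_X(x) = \begin{cases} \frac{1}{2} e^{-x}, & \text{if } x \geq 0; \\ \frac{1}{2} e^{x}, & \text{if } x < 0, \end{cases} \]
and
\[ \Pr\left( |X| < \frac{3}{4} \right) = \Pr\left( -\frac{3}{4} < X < \frac{3}{4} \right) = \int_{-3/4}^{0} \frac{1}{2} e^{x}\,dx + \int_0^{3/4} \frac{1}{2} e^{-x}\,dx = \frac{1}{2}\left( 1 - e^{-3/4} \right) + \frac{1}{2}\left( 1 - e^{-3/4} \right) = 1 - e^{-3/4}. \]
Or, alternatively:
\[ \Pr\left( |X| < \frac{3}{4} \right) = \Pr\left( -\frac{3}{4} < X < \frac{3}{4} \right) = \Pr\left( X < \frac{3}{4} \right) - \Pr\left( X \leq -\frac{3}{4} \right) \overset{*}{=} \Pr\left( X \leq \frac{3}{4} \right) - \Pr\left( X \leq -\frac{3}{4} \right) = F_X(3/4) - F_X(-3/4) = 1 - e^{-3/4} \]
* using \(\Pr(X = x) = 0\) for continuous r.v.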
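Both the m.g.f. formula and the probability in part 4 can be verified by numerical integration. This sketch assumes the density is the double exponential with rate parameter written here as `lam`, evaluates the m.g.f. at one hypothetical point t = 0.3, and truncates the real line to [-60, 60], which is an arbitrary range wide enough for the tails to be negligible.

```python
from math import exp


def laplace_pdf(x, lam):
    """Density (lam / 2) * exp(-lam * |x|), the two-sided exponential."""
    return 0.5 * lam * exp(-lam * abs(x))


def integrate(f, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


lam, t = 1.0, 0.3  # hypothetical values with -lam < t < lam

# E[exp(tX)] by numerical integration versus the closed form lam^2 / (lam^2 - t^2).
numeric_mgf = integrate(lambda x: exp(t * x) * laplace_pdf(x, lam), -60.0, 60.0)
closed_form_mgf = lam**2 / (lam**2 - t**2)

# Pr(|X| < 3/4) for lam = 1 versus the closed form 1 - exp(-3/4).
numeric_prob = integrate(lambda x: laplace_pdf(x, 1.0), -0.75, 0.75)
closed_form_prob = 1 - exp(-0.75)

print(numeric_mgf, closed_form_mgf)
print(numeric_prob, closed_form_prob)
```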
Solution 1.16: [wk01Q16, Exercise, Schedule] We define the force of mortality as:
F X (x + h) F X (x)
(x) = lim .
h0 h (1 F X (x))
The force of mortality is thus a conditional instantaneous rate of death at age xconditional on
surviving to age x. (The corresponding conditional instantaneous probability of death at age x is thus
x dx.)
1. First, note that we can express this as:

1 F X (x + h) F X (x) F X0 (x) fX (x)
(x) = lim = = .
1 F X (x) |
h0
{zh } 1 F X (x) 1 F X (x)
F X0 (x)
26
Now, integrate both sides from 0 to x and we get:

Z x Z x
fX (z)
(z) dz = dz.
0 0 1 F X (z)
Applying a change of variable u = 1 F X (z), du = fX (z)dz then
Z x Z 1FX (x)
1
(z) dz = du = [ln(u)]1F
1
X (x)
= ln (1 F X (x)) .
0 1 u
Thus we have,
Z x !
exp (z) dz = exp ( ( ln(1) F X (x))) = 1 F X (x) ,
0
which then gives the result.

2. By definition of expectation, we know $E[X] = \int_0^\infty z f_X(z)\,dz$. Now integrate the right-hand side
by applying integration by parts (see F&T page 3), i.e., $\int_a^b u \frac{dv}{dx}\,dx = [u\,v]_a^b - \int_a^b v \frac{du}{dx}\,dx$, with
\[ u = z \quad\text{and}\quad \frac{dv}{dz} = f_X(z), \]
so that $du = dz$ and $v = F_X(z) - 1 = -(1 - F_X(z))$. The choice of $v$ is not black magic: if you just use
$F_X(z)$ then the first term on the RHS below is not finite [3]. The only choice $F_X(z) + C$ for an
antiderivative of $f_X(z)$ that might produce finite terms on the RHS is $F_X(z) - 1$ [4]. Therefore
\begin{align*}
E[X] &= \int_0^\infty z f_X(z)\,dz \\
&= [-z(1 - F_X(z))]_0^\infty + \int_0^\infty (1 - F_X(z))\,dz \\
&= -0 + 0 + \int_0^\infty (1 - F_X(z))\,dz = \int_0^\infty (1 - F_X(z))\,dz.
\end{align*}
Using this result, we have:
\begin{align*}
E[X] &= \int_0^\infty (1 - F_X(z))\,dz \\
&= \int_0^\infty \underbrace{\frac{1 - F_X(z)}{f_X(z)}}_{1/\mu(z)} f_X(z)\,dz = E\left[\frac{1}{\mu(X)}\right].
\end{align*}
[3] Using $v = F_X(z)$ we would have:
\begin{align*}
E[X] &= \int_0^\infty z f_X(z)\,dz \\
&= [z F_X(z)]_0^\infty - \int_0^\infty F_X(z)\,dz \\
&= \infty - 0 - \int_0^\infty F_X(z)\,dz.
\end{align*}
Hence, the only way to get the first part ($[z(F_X(z) + C)]_0^\infty$) to a value smaller than infinity (and larger than minus
infinity) is to set $C = -1$.
[4] That the integrated term vanishes for $x \to \infty$ is proven as follows: since $E[X] < \infty$, the integral $\int_0^\infty x f_X(x)\,dx$ is
convergent, and hence the tails tend to zero, so
\[ x[1 - F_X(x)] = x\int_x^\infty f_X(t)\,dt \le \int_x^\infty t f_X(t)\,dt \to 0 \quad\text{for } x \to \infty. \]
Solution 1.17: [wk01Q17, Exercise, Schedule] We are given the p.d.f.:
\[ f_X(x) = ax(1 - bx^2), \quad\text{for } 0 \le x \le 1. \]
1. We must have $\int_{-\infty}^{\infty} f_X(x)\,dx = \int_0^1 f_X(x)\,dx = 1$ so that:
\[ \int_0^1 ax(1 - bx^2)\,dx = \int_0^1 (ax - abx^3)\,dx = \left[\tfrac{1}{2}ax^2 - \tfrac{1}{4}abx^4\right]_0^1 = a\,\frac{2 - b}{4} = 1 \;\Rightarrow\; a = \frac{4}{2 - b}. \]
Since $a > 0$ (given), we must have $b < 2$. Also, $f_X(x) \ge 0$ for all $x$ so that at $x = 1$, $f_X(1) =
a(1 - b) \ge 0$ which implies $b \le 1$.
2. If $b = 1$, then $a = 4$ and $f_X(x) = 4x(1 - x^2)$. Therefore, the mean is
\[ E[X] = \int_0^1 4x^2(1 - x^2)\,dx = \frac{8}{15} \]
and
\[ E[X^2] = \int_0^1 4x^3(1 - x^2)\,dx = \frac{1}{3}. \]
The variance is $\mathrm{Var}(X) = E[X^2] - (E[X])^2 = \frac{1}{3} - \left(\frac{8}{15}\right)^2 = \frac{11}{225}$.
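The three integrals in part 2 are polynomial integrals over $[0, 1]$, so they can be verified exactly with rational arithmetic. This is a minimal sketch; the helper name `integrate_poly_01` and the dictionary representation of a polynomial are assumptions made for the illustration.

```python
from fractions import Fraction

def integrate_poly_01(coeffs):
    """Integrate sum(c * x**k) over [0, 1]; coeffs maps power k -> coefficient c."""
    return sum(Fraction(c) / (k + 1) for k, c in coeffs.items())

# f_X(x) = 4x - 4x^3, i.e. the case b = 1, a = 4.
density = {1: 4, 3: -4}
total = integrate_poly_01(density)
mean = integrate_poly_01({k + 1: c for k, c in density.items()})
second_moment = integrate_poly_01({k + 2: c for k, c in density.items()})
variance = second_moment - mean ** 2
print(total, mean, second_moment, variance)  # 1 8/15 1/3 11/225
```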
Solution 1.18: [wk02Q1, Exercise, Schedule]
1. $X \sim \text{Binomial}(n = 10, p = 0.01)$.
2. $X \sim \text{Binomial}(n = 20, p = 0.90)$.
3. Let $Y$ be the random variable representing the number of people tested to get 100 rare blood
types. Then $Y \sim \text{N.B.}(r = 100, p = 0.10)$. We have that $X = Y - 100$, hence $X + 100 \sim
\text{N.B.}(r = 100, p = 0.10)$.
4. 2 per minute, thus in 15 minutes, you expect 30 on average. $X \sim \text{Poisson}(\lambda = 30)$.
5. $X \sim \text{Binomial}(n = 25, p = 0.35)$.
Solution 1.19: [wk02Q2, Exercise, Schedule]
1. Geometric($p = 1/2$)
2. Binomial($n = 3$, $p = 1/2$)
3. Poisson($\lambda = 1/2$)
4. N.B.($r = 4$, $p = 1/2$)
5. Binomial($n = 5$, $p = 3/4$)
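Solution 1.20: [wk02Q3, Exercise, Schedule] Let $X \sim \text{Binomial}(n, p)$. Then
\[ M_X(t) = \left(pe^t + (1 - p)\right)^n \]
and let $Y \sim \text{Poisson}(\lambda)$. Then
\[ M_Y(t) = e^{\lambda(e^t - 1)} = e^{np(e^t - 1)}, \]
where $\lambda$ is set to $np$.
1. Now, consider
\begin{align*}
\lim_{n \to \infty} M_X(t) &= \lim_{n \to \infty} \left(pe^t + (1 - p)\right)^n \\
&= \lim_{n \to \infty} \left(1 + p(e^t - 1)\right)^n \\
&= \lim_{n \to \infty} \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n.
\end{align*}
Equivalently, by taking $x = 1/n$, so that as $n \to \infty$, $x \to 0$,
\[ \lim_{n \to \infty} M_X(t) = \lim_{x \to 0} \left(1 + x\lambda(e^t - 1)\right)^{1/x} = e^{\lambda(e^t - 1)} = M_Y(t), \]
* using:
\[ \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x \]
or, equivalently ($h = 1/n$),
\[ \lim_{h \to 0} (1 + hx)^{1/h} = e^x. \]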
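2. Let $Y \sim \text{Poisson}(\lambda)$ so that:
\[ \frac{\Pr(Y = x)}{\Pr(Y = x - 1)} = \frac{e^{-\lambda}\lambda^x / x!}{e^{-\lambda}\lambda^{x-1} / (x - 1)!} = \frac{\lambda}{x}, \quad\text{for } x = 1, 2, \ldots. \]
Similarly, for $X \sim \text{Binomial}(n, p)$, we have:
\[ \frac{\Pr(X = x)}{\Pr(X = x - 1)} = \frac{\binom{n}{x} p^x (1 - p)^{n-x}}{\binom{n}{x-1} p^{x-1} (1 - p)^{n-x+1}} = \frac{n - x + 1}{x} \cdot \frac{p}{1 - p}, \]
* using $\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$, thus $\binom{n}{x} \Big/ \binom{n}{x-1} = \frac{n!}{(n-x)!\,x!} \cdot \frac{(n-x+1)!\,(x-1)!}{n!} = \frac{n - x + 1}{x}$.
Now, let $\lambda = np$ and let $p \to 0$, i.e., let $p$ be small. We have:
\begin{align*}
\lim_{p \to 0} \frac{n - x + 1}{x} \cdot \frac{p}{1 - p} &= \lim_{p \to 0} \frac{np - xp + p}{x - px} = \lim_{p \to 0} \frac{np - p(x - 1)}{x - px} \\
&= \lim_{p \to 0} \frac{np - p(x - 1)}{(1 - p)x} = \lim_{p \to 0} \frac{np}{x} = \frac{\lambda}{x},
\end{align*}
** using $\lim_{p \to 0} np = \lambda$, $\lim_{p \to 0} (1 - p) = 1$, and $\lim_{p \to 0} p(x - 1) = 0$. Note that:
\[ \Pr(Y = 0) = e^{-\lambda} \]
and that, using the p.m.f. of the Binomial distribution we have:
\[ \Pr(X = 0) = \binom{n}{0} p^0 (1 - p)^n = (1 - p)^n = \left(1 - \frac{\lambda}{n}\right)^n \]
so that
\[ \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^n = e^{-\lambda} = \Pr(Y = 0). \]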
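3. $n = 300 \times 10 = 3\,000$ and $p = \frac{1}{400}$. Then $X \sim \text{Binomial}(n = 3\,000, p = 1/400)$ and $\lambda =
E[X] = np = 7.5$. Approximating using $Y \sim \text{Poisson}(\lambda = 7.5)$, we have:
\begin{align*}
\Pr(X > 3) \approx \Pr(Y > 3) &= 1 - \Pr(Y \le 3) \\
&= 1 - [\Pr(Y = 0) + \Pr(Y = 1) + \Pr(Y = 2) + \Pr(Y = 3)] \\
&= 1 - e^{-7.5}\left(1 + \frac{7.5}{1!} + \frac{(7.5)^2}{2!} + \frac{(7.5)^3}{3!}\right) = 0.940\,854\,5.
\end{align*}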
Doing the Poisson approximation in R you would use:
1-sum(dpois(0:3,7.5))
The true value is given by:
1-sum(dbinom(0:3,3000,(1/400)))
which gives 0.941 073 3.
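For readers without R at hand, the same two tail probabilities can be reproduced with the Python standard library. This is a minimal sketch; the helper names `poisson_pmf` and `binomial_pmf` are chosen for the illustration only.

```python
import math

def poisson_pmf(k, lam):
    """Pr(Y = k) for Y ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def binomial_pmf(k, n, p):
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 3000, 1 / 400
lam = n * p  # 7.5
poisson_tail = 1 - sum(poisson_pmf(k, lam) for k in range(4))
binomial_tail = 1 - sum(binomial_pmf(k, n, p) for k in range(4))
print(round(poisson_tail, 7), round(binomial_tail, 7))
```

The two values differ only in the fourth decimal place, which is what makes the Poisson approximation attractive for large $n$ and small $p$.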
Solution 1.21: [wk02Q4, Exercise, Schedule] Let $X$ = number of claims over \$100. Then $X \sim
\text{Binomial}(n = 200, p = 0.05)$.
1. $\Pr(X = 0) = \binom{200}{0}(0.05)^0(0.95)^{200} = 0.0000351$.
2. $\Pr(X = 0) + \Pr(X = 1) + \Pr(X = 2) = 0.0000351 + \binom{200}{1}(0.05)^1(0.95)^{199} + \binom{200}{2}(0.05)^2(0.95)^{198} =
0.0023363$.
3. $E[5(n - X)] = 5\,E[n - X] = 5(n - E[X]) = 5(200 - 200 \times 0.05) = 950$.
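Solution 1.22: [wk02Q5, Exercise, Schedule] We are given $X \sim \text{Gamma}(\alpha, \beta)$ so that the p.d.f. has
the form
\[ f_X(x) = \frac{\beta^\alpha x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)}, \quad\text{for } x \ge 0. \]
1. The m.g.f. is:
\begin{align*}
M_X(t) = E\left[e^{Xt}\right] &= \int_0^\infty e^{xt}\,\frac{\beta^\alpha x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)}\,dx \\
&= \int_0^\infty \frac{\beta^\alpha x^{\alpha-1} e^{-x(\beta - t)}}{\Gamma(\alpha)}\,dx \\
&= \frac{\beta^\alpha}{(\beta - t)^\alpha} \underbrace{\int_0^\infty \frac{(\beta - t)^\alpha x^{\alpha-1} e^{-x(\beta - t)}}{\Gamma(\alpha)}\,dx}_{\text{density of a Gamma}(\alpha,\,\beta - t)} \\
&= \left(\frac{\beta}{\beta - t}\right)^\alpha
\end{align*}
and this will exist provided $\beta - t > 0$ or equivalently $t < \beta$.
2. To find an expression for the higher moments,
\begin{align*}
E[X^r] &= \int_0^\infty x^r\,\frac{\beta^\alpha x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)}\,dx \\
&= \int_0^\infty \frac{\beta^\alpha x^{r+\alpha-1} e^{-\beta x}}{\Gamma(\alpha)}\,dx \\
&= \frac{\Gamma(r + \alpha)}{\Gamma(\alpha)\,\beta^r} \underbrace{\int_0^\infty \frac{\beta^{r+\alpha} x^{r+\alpha-1} e^{-\beta x}}{\Gamma(r + \alpha)}\,dx}_{\text{density of a Gamma}(r + \alpha,\,\beta)} \\
&= \frac{\Gamma(r + \alpha)}{\beta^r\,\Gamma(\alpha)}.
\end{align*}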
Solution 1.23: [wk02Q6, Exercise, Schedule] Returns are normally distributed. Use the standard
normal table to get the desired probabilities, but only after standardising, i.e., $Z = \frac{X - \mu}{\sigma}$.
1. The probability that the investment will be below 1 000 is:
\begin{align*}
\Pr(1000(1 + R_A) < 1000) = \Pr(R_A < 0) &= \Pr\left(Z_A < \frac{0 - 0.05}{\sqrt{0.1}}\right) \\
&= \Pr(Z_A < -0.16) = 1 - \Phi(0.16) \\
&= 1 - 0.5636 = 0.4364.
\end{align*}
2. The probability that the investment will exceed 1 200 is:
\begin{align*}
\Pr(1000(1 + R_A) > 1\,200) = \Pr(R_A > 0.2) &= \Pr\left(Z_A > \frac{0.2 - 0.05}{\sqrt{0.1}}\right) \\
&= \Pr(Z_A > 0.47) = 1 - \Phi(0.47) \\
&= 1 - 0.6808 = 0.3192.
\end{align*}
The same procedure applies to investment B, but with a different mean and variance. You can
verify that for part (c), the probability is 0.4443 and for part (d), the probability is 0.4443.
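The table look-ups for investment A can be cross-checked without rounding $z$ to two decimals. The sketch below assumes mean 0.05 and variance 0.1 for $R_A$, as read off the standardisation above; `std_normal_cdf` is a helper name chosen for the illustration, built on `math.erf`.

```python
import math

def std_normal_cdf(z):
    """Standard normal c.d.f. Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu_a, var_a = 0.05, 0.1  # assumed parameters of R_A, taken from the working above
sigma_a = math.sqrt(var_a)
p_below_1000 = std_normal_cdf((0.0 - mu_a) / sigma_a)
p_above_1200 = 1.0 - std_normal_cdf((0.2 - mu_a) / sigma_a)
print(round(p_below_1000, 4), round(p_above_1200, 4))
```

The results differ slightly from the table-based answers because the table values use $z$ rounded to 0.16 and 0.47.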
Solution 1.24: [wk02Q7, Exercise, Schedule] Let $T_1$ and $T_2$ be the times until the next accident at
each of the busy intersections. Then $T_1 \sim \text{Exp}(2)$ and $T_2 \sim \text{Exp}(2.5)$.
1. The probability that there are no accidents at either intersection in the next month is:
\begin{align*}
\Pr(T_1 > 1 \cap T_2 > 1) &= \Pr(T_1 > 1)\,\Pr(T_2 > 1) \\
&= (1 - \Pr(T_1 \le 1))\,(1 - \Pr(T_2 \le 1)) \\
&= (1 - F_{T_1}(1))\,(1 - F_{T_2}(1)) \\
&= \left(1 - (1 - e^{-2})\right)\left(1 - (1 - e^{-2.5})\right) \\
&= e^{-2}\,e^{-2.5} = e^{-4.5} = 0.0111,
\end{align*}
* using independence between $T_1$ and $T_2$.
2. The probability that there will be no accidents for at least one of the intersections in the next
month is:
\begin{align*}
\Pr(T_1 > 1 \cup T_2 > 1) &= \Pr(T_1 > 1) + \Pr(T_2 > 1) - \Pr(T_1 > 1 \cap T_2 > 1) \\
&= (1 - \Pr(T_1 \le 1)) + (1 - \Pr(T_2 \le 1)) - \Pr(T_1 > 1 \cap T_2 > 1) \\
&= (1 - F_{T_1}(1)) + (1 - F_{T_2}(1)) - \Pr(T_1 > 1 \cap T_2 > 1) \\
&= \left(1 - (1 - e^{-2})\right) + \left(1 - (1 - e^{-2.5})\right) - \Pr(T_1 > 1 \cap T_2 > 1) \\
&= e^{-2} + e^{-2.5} - e^{-4.5} = 0.2063.
\end{align*}
1. Note that:

+1
Z Z Z
E [X] = x fX (x)dx = x dx = x dx
x
" +1 #
x
= =
1 1
and that
+1
2
Z Z Z

h i
E X 2
= x fX (x)dx =
2
x dx = x+1 dx
x
" +2 #
x 2
= = .
2 2
Thus, the variance is given by:
2 2 2
Var (X) = = .
2 1 ( 1)2 ( 2)
2. The distribution function of X is given by:

Z x +1 " # x

Z x
u
F X (x) = fX (u)du = du =
u

!
1 1
= =1 for x >
x x
We have that F X (x) = 0 for x . Hence, the quantile function of X is given by solving

u = 1 F 1(u)
X

F X1 (u) = ,
(1 u)1/
where u [0, 1].
3. The median is therefore:

M = 21/ .
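4. Define $X$ to be the loss amount and $Y$ the amount of a claim. Thus,
\[ Y = \begin{cases} X - 5, & \text{if } X \ge 5; \\ 0, & \text{otherwise} \end{cases} \;=\; \max\{0, X - 5\}. \]
Notice that $Y$ is a mixed random variable with a probability mass at zero, reflecting
loss amounts smaller than or equal to five, i.e., $\Pr(Y = 0) = \int_0^5 f_X(x)\,dx$. For loss amounts
larger than five, the density of the claim amount at $y$ equals the density of the loss amount at
$x = y + 5$.
The expected loss amount is:
\[ E[X] = \frac{3.5\,(4)}{3.5 - 1} = 5.6 \]
and the expected amount of a single claim is:
\begin{align*}
E[Y] = \int_0^\infty y f_Y(y)\,dy &= \int_5^\infty (x - 5) f_X(x)\,dx = \int_5^\infty (x - 5)\,\frac{3.5}{4}\left(\frac{4}{x}\right)^{4.5} dx \\
&= 0.875 \cdot 4^{4.5} \int_5^\infty \left(x^{-3.5} - 5x^{-4.5}\right) dx \\
&= 448\left[\frac{x^{-2.5}}{-2.5} - 5\,\frac{x^{-3.5}}{-3.5}\right]_5^\infty = 448\left[\frac{(5)^{-2.5}}{2.5} - 5\,\frac{(5)^{-3.5}}{3.5}\right] \\
&= 0.9159
\end{align*}
and the variance of a single claim is:
\begin{align*}
\mathrm{Var}(Y) = E\left[(Y - E(Y))^2\right] &= E\left[Y^2\right] - (0.9159)^2 = \int_0^\infty y^2 f_Y(y)\,dy - 0.8389 \\
&= \int_5^\infty (x - 5)^2 f_X(x)\,dx - 0.8389 \\
&= 0.875 \cdot 4^{4.5} \int_5^\infty (x - 5)^2 x^{-4.5}\,dx - 0.8389 \\
&= 0.875 \cdot 4^{4.5} \int_5^\infty \left(x^2 + 25 - 10x\right) x^{-4.5}\,dx - 0.8389 \\
&= 0.875 \cdot 4^{4.5} \int_5^\infty \left(x^{-2.5} + 25x^{-4.5} - 10x^{-3.5}\right) dx - 0.8389 \\
&= 0.875 \cdot 4^{4.5} \left[\frac{x^{-1.5}}{-1.5} + \frac{25x^{-3.5}}{-3.5} - \frac{10x^{-2.5}}{-2.5}\right]_5^\infty - 0.8389 \\
&= 448\left(\frac{(5)^{-1.5}}{1.5} - 2\,(5)\,\frac{(5)^{-2.5}}{2.5} + (5)^2\,\frac{(5)^{-3.5}}{3.5}\right) - 0.8389 \\
&= 5.267.
\end{align*}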
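The mean and variance of the claim amount can be double-checked by numerical integration. The sketch below is illustrative only: `integrate_tail` is a hypothetical helper that maps $[5, \infty)$ onto $(0, 1]$ through the substitution $x = 5/t$ and then applies the midpoint rule; the density is the one used in the working above.

```python
def integrate_tail(g, lower, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [lower, infinity).

    Uses the substitution x = lower / t with t in (0, 1], so dx = lower / t**2 dt.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        x = lower / t
        total += g(x) * lower / (t * t)
    return total * h

def loss_density(x):
    # Density from the working above: (3.5 / 4) * (4 / x) ** 4.5.
    return (3.5 / 4.0) * (4.0 / x) ** 4.5

deductible = 5.0
mean_claim = integrate_tail(lambda x: (x - deductible) * loss_density(x), deductible)
second_moment = integrate_tail(lambda x: (x - deductible) ** 2 * loss_density(x), deductible)
var_claim = second_moment - mean_claim ** 2
print(round(mean_claim, 4), round(var_claim, 3))
```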
Solution 1.26: [wk03Q4, Exercise, Schedule] First we derive the marginals, $\Pr(X = x) = \sum_y \Pr(X = x, Y = y)$:
x            0     1     2     3
Pr(X = x)    0.25  0.35  0.27  0.13
and $\Pr(Y = y) = \sum_x \Pr(X = x, Y = y)$:
y            1     2
Pr(Y = y)    0.45  0.55
The conditional probability functions are $\Pr(X = x \mid Y = 1) = \frac{\Pr(X = x, Y = 1)}{\Pr(Y = 1)}$:
x                  0    1    2    3
Pr(X = x | Y = 1)  1/9  4/9  3/9  1/9
and $\Pr(Y = y \mid X = 3) = \frac{\Pr(X = 3, Y = y)}{\Pr(X = 3)}$:
y                  1     2
Pr(Y = y | X = 3)  5/13  8/13
1. $E[X] = \sum_x x \Pr(X = x) = 0(0.25) + 1(0.35) + 2(0.27) + 3(0.13) = 1.28$
2. $E[Y] = \sum_y y \Pr(Y = y) = 1(0.45) + 2(0.55) = 1.55$
3. $E[X \mid Y = 1] = \sum_x x \Pr(X = x \mid Y = 1) = 0(1/9) + 1(4/9) + 2(3/9) + 3(1/9) = 13/9$
4. $\mathrm{Var}(Y \mid X = 3) = E[Y^2 \mid X = 3] - (E[Y \mid X = 3])^2 = 40/169$, where $E[Y^2 \mid X = 3] = \sum_y y^2 \Pr(Y =
y \mid X = 3) = 1^2 \cdot 5/13 + 2^2 \cdot 8/13 = 37/13$ and $E[Y \mid X = 3] = \sum_y y \Pr(Y = y \mid X = 3) =
1 \cdot 5/13 + 2 \cdot 8/13 = 21/13$
5. $E[XY] = \sum_x \sum_y xy \Pr(X = x, Y = y) = 0 \cdot 1 \cdot 0.05 + 1 \cdot 1 \cdot 0.20 + 2 \cdot 1 \cdot 0.15 + 3 \cdot 1 \cdot 0.05 + 0 \cdot
2 \cdot 0.20 + 1 \cdot 2 \cdot 0.15 + 2 \cdot 2 \cdot 0.12 + 3 \cdot 2 \cdot 0.08 = 1.91$ and $\mathrm{Cov}(X, Y) = E[XY] - E[X]\,E[Y] =
1.91 - 1.28 \cdot 1.55 = -0.074$.
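The quantities above can be recomputed exactly from the joint p.m.f., whose cell values appear in the expansion of $E[XY]$ in part 5. This is a minimal sketch using exact fractions; the dictionary layout and the helper name `expect` are choices made for the illustration.

```python
from fractions import Fraction as F

# Joint p.m.f. Pr(X = x, Y = y), keyed by (x, y), as used in part 5.
joint = {
    (0, 1): F(5, 100), (1, 1): F(20, 100), (2, 1): F(15, 100), (3, 1): F(5, 100),
    (0, 2): F(20, 100), (1, 2): F(15, 100), (2, 2): F(12, 100), (3, 2): F(8, 100),
}

def expect(g):
    """E[g(X, Y)] under the joint p.m.f."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

# Conditional distribution of Y given X = 3, then its variance.
p_x3 = sum(p for (x, _), p in joint.items() if x == 3)
cond_y = {y: p / p_x3 for (x, y), p in joint.items() if x == 3}
cond_mean = sum(y * p for y, p in cond_y.items())
cond_var = sum(y * y * p for y, p in cond_y.items()) - cond_mean ** 2
print(mean_x, mean_y, cov_xy, cond_var)  # 32/25 31/20 -37/500 40/169
```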
Solution 1.27: [wk03Q5, Exercise, Schedule] The marginals can be obtained using:
\[ p_X(x) = \sum_y p(x, y) \quad\text{and}\quad p_Y(y) = \sum_x p(x, y) \]
and the conditional probability mass functions using:
\[ p_{X|Y}(x \mid Y = 2) = \frac{p(x, 2)}{p_Y(2)} \quad\text{and}\quad p_{Y|X}(y \mid X = 2) = \frac{p(2, y)}{p_X(2)}. \]
1. The marginal probability mass functions for $X$ and $Y$, respectively:
x or y               1     2     3     4
p_X(x) = Pr(X = x)   0.19  0.32  0.31  0.18
p_Y(y) = Pr(Y = y)   0.19  0.32  0.31  0.18
2. The conditional probability mass functions are:
x or y               1     2      3     4
Pr(X = x | Y = 2)    5/32  20/32  5/32  2/32
Pr(Y = y | X = 2)    5/32  20/32  5/32  2/32
3. We have
\begin{align*}
E[XY] = \sum_x \sum_y x\,y\,p(x, y) &= 0.10 \cdot 1 \cdot 1 + 0.05 \cdot 2 \cdot 1 + 0.02 \cdot 3 \cdot 1 + 0.02 \cdot 4 \cdot 1 \\
&\quad + 0.05 \cdot 1 \cdot 2 + 0.20 \cdot 2 \cdot 2 + 0.05 \cdot 3 \cdot 2 + 0.02 \cdot 4 \cdot 2 \\
&\quad + 0.02 \cdot 1 \cdot 3 + 0.05 \cdot 2 \cdot 3 + 0.20 \cdot 3 \cdot 3 + 0.04 \cdot 4 \cdot 3 \\
&\quad + 0.02 \cdot 1 \cdot 4 + 0.02 \cdot 2 \cdot 4 + 0.04 \cdot 3 \cdot 4 + 0.10 \cdot 4 \cdot 4 \\
&= 0.34 + 1.36 + 2.64 + 2.32 = 6.66.
\end{align*}
We have $\mathrm{Cov}(X, Y) = E[XY] - E[X]\,E[Y] = 6.66 - 2.48 \cdot 2.48 = 0.5096$, where $E[X] =
\sum_x x\,p_X(x) = 1 \cdot 0.19 + 2 \cdot 0.32 + 3 \cdot 0.31 + 4 \cdot 0.18 = 2.48$ and $E[Y] = \sum_y y\,p_Y(y) = 1 \cdot 0.19 + 2 \cdot
0.32 + 3 \cdot 0.31 + 4 \cdot 0.18 = 2.48$.
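1. Rewrite $f_{X,Y}(x, y) = \frac{6}{7}x^2 + \frac{6}{7}y^2 + \frac{12}{7}xy$ for $0 \le x, y \le 1$. We have that:
i.
\begin{align*}
\Pr(X < Y) &= \int_{-\infty}^{\infty}\int_{-\infty}^{y} f_{X,Y}(x, y)\,dx\,dy = \int_0^1\int_0^y \left(\tfrac{6}{7}x^2 + \tfrac{6}{7}y^2 + \tfrac{12}{7}xy\right) dx\,dy \\
&= \int_0^1 \left[\tfrac{6}{21}x^3 + \tfrac{6}{7}y^2 x + \tfrac{12}{14}x^2 y\right]_0^y dy = \int_0^1 \left(\tfrac{2}{7}y^3 + \tfrac{6}{7}y^3 + \tfrac{6}{7}y^3\right) dy \\
&= \int_0^1 \tfrac{14}{7}y^3\,dy = 2\left[\frac{y^4}{4}\right]_0^1 = \frac{1}{2}.
\end{align*}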
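ii. Use $X + Y \le 1 \Leftrightarrow X \le 1 - Y$:
\begin{align*}
\Pr(X + Y \le 1) &= \int_{-\infty}^{\infty}\int_{-\infty}^{1-y} f_{X,Y}(x, y)\,dx\,dy = \int_0^1\int_0^{1-y} \left(\tfrac{6}{7}x^2 + \tfrac{6}{7}y^2 + \tfrac{12}{7}xy\right) dx\,dy \\
&= \int_0^1 \left[\tfrac{6}{21}x^3 + \tfrac{6}{7}y^2 x + \tfrac{12}{14}x^2 y\right]_0^{1-y} dy \\
&= \int_0^1 \left(\tfrac{6}{21}(1 - y)^3 + \tfrac{6}{7}(1 - y)y^2 + \tfrac{12}{14}(1 - y)^2 y\right) dy \\
&= \int_1^0 \left(-\tfrac{6}{21}z^3 - \tfrac{6}{7}z(1 - z)^2 - \tfrac{12}{14}z^2(1 - z)\right) dz \\
&= \int_0^1 \left(\left(\tfrac{6}{21} + \tfrac{6}{7} - \tfrac{12}{14}\right) z^3 + \left(\tfrac{12}{14} - 2 \cdot \tfrac{6}{7}\right) z^2 + \tfrac{6}{7}z\right) dz \\
&= \int_0^1 \left(\tfrac{2}{7}z^3 - \tfrac{6}{7}z^2 + \tfrac{6}{7}z\right) dz \\
&= \left[\tfrac{2}{7 \cdot 4}z^4 - \tfrac{6}{7 \cdot 3}z^3 + \tfrac{3}{7}z^2\right]_0^1 = \frac{3}{14},
\end{align*}
* using $z = 1 - y$, $dz = -1\,dy$.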
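iii.
\begin{align*}
\Pr(X \le 1/2) &= \int_0^{1/2}\int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy\,dx = \int_0^{1/2}\int_0^1 \left(\tfrac{6}{7}x^2 + \tfrac{6}{7}y^2 + \tfrac{12}{7}xy\right) dy\,dx \\
&= \int_0^{1/2} \left[\tfrac{6}{7}x^2 y + \tfrac{6}{21}y^3 + \tfrac{12}{14}xy^2\right]_0^1 dx = \int_0^{1/2} \left(\tfrac{6}{7}x^2 + \tfrac{6}{21} + \tfrac{12}{14}x\right) dx \\
&= \left[\tfrac{6}{21}x^3 + \tfrac{6}{21}x + \tfrac{6}{14}x^2\right]_0^{1/2} = \frac{2}{7}.
\end{align*}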
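2. Rewriting $f_{X,Y}(x, y) = \frac{6}{7}(x^2 + y^2 + 2xy)$, we have the following marginal densities:
\begin{align*}
f_X(x) &= \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy = \int_0^1 \tfrac{6}{7}(x^2 + y^2 + 2xy)\,dy \\
&= \tfrac{6}{7}\left[x^2 y + \tfrac{1}{3}y^3 + xy^2\right]_0^1 = \tfrac{2}{7}\left(3x^2 + 3x + 1\right) \quad\text{for } 0 \le x \le 1,
\end{align*}
and zero otherwise, and
\begin{align*}
f_Y(y) &= \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx = \int_0^1 \tfrac{6}{7}(x^2 + y^2 + 2xy)\,dx \\
&= \tfrac{6}{7}\left[y^2 x + \tfrac{1}{3}x^3 + yx^2\right]_0^1 = \tfrac{2}{7}\left(3y^2 + 3y + 1\right) \quad\text{for } 0 \le y \le 1,
\end{align*}
and zero otherwise.
3. You can also immediately check the following conditional densities:
\[ f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{3(x + y)^2}{3x^2 + 3x + 1} \quad\text{for } 0 \le y \le 1, \]
\[ f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \frac{3(x + y)^2}{3y^2 + 3y + 1} \quad\text{for } 0 \le x \le 1, \]
and zero otherwise.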
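The three probabilities in part 1 (1/2, 3/14 and 2/7) can be sanity-checked with a crude midpoint rule on a grid over the unit square. This is only a sketch: the function names and the grid size are assumptions for the illustration, and the result is accurate to roughly three decimals because the event boundaries cut through grid cells.

```python
def density(x, y):
    """Joint density f(x, y) = (6/7)(x + y)^2 on the unit square."""
    return 6.0 / 7.0 * (x + y) ** 2

def probability(event, n=400):
    """Midpoint-rule approximation of Pr(event) on an n-by-n grid over [0, 1]^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if event(x, y):
                total += density(x, y)
    return total * h * h

p_i = probability(lambda x, y: x < y)
p_ii = probability(lambda x, y: x + y <= 1.0)
p_iii = probability(lambda x, y: x <= 0.5)
print(round(p_i, 3), round(p_ii, 3), round(p_iii, 3))
```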
Solution 1.29: [wk03Q8, Exercise, Schedule] Let $\bar{x}_n$ and $s_n^2$ denote the sample mean and variance
for the sample $x_1, x_2, \ldots, x_n$. Let $\bar{x}_{n+1}$ and $s_{n+1}^2$ denote these quantities when an additional observation
$x_{n+1}$ is added to the sample.
1. To show $\bar{x}_{n+1}$ can be computed from $\bar{x}_n$ and $x_{n+1}$, note that
\[ \bar{x}_{n+1} = \frac{1}{n+1}\sum_{k=1}^{n+1} x_k = \frac{1}{n+1}\left(\sum_{k=1}^{n} x_k + x_{n+1}\right) = \frac{n\bar{x}_n + x_{n+1}}{n+1}. \]
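2. Using the result in part (a), we start with
\begin{align*}
s_{n+1}^2 &= \frac{1}{(n+1) - 1}\sum_{k=1}^{n+1} (x_k - \bar{x}_{n+1})^2 = \frac{1}{n}\sum_{k=1}^{n+1} \left(x_k - \frac{n\bar{x}_n + x_{n+1}}{n+1}\right)^2 \\
&= \frac{1}{n}\sum_{k=1}^{n+1} \left(x_k - \frac{((n+1) - 1)\bar{x}_n + x_{n+1}}{n+1}\right)^2 \\
&= \frac{1}{n}\sum_{k=1}^{n+1} \left((x_k - \bar{x}_n) + \frac{\bar{x}_n - x_{n+1}}{n+1}\right)^2 \\
&= \frac{1}{n}\sum_{k=1}^{n+1} \left[(x_k - \bar{x}_n)^2 + 2(x_k - \bar{x}_n)\frac{\bar{x}_n - x_{n+1}}{n+1} + \left(\frac{\bar{x}_n - x_{n+1}}{n+1}\right)^2\right] \\
&= \frac{1}{n}\sum_{k=1}^{n} (x_k - \bar{x}_n)^2 + \frac{1}{n}(x_{n+1} - \bar{x}_n)^2 \\
&\quad + \frac{2}{n}\sum_{k=1}^{n+1} (x_k - \bar{x}_n)\frac{\bar{x}_n - x_{n+1}}{n+1} + \frac{1}{n}\sum_{k=1}^{n+1} \left(\frac{\bar{x}_n - x_{n+1}}{n+1}\right)^2 \\
&= \frac{1}{n}\sum_{k=1}^{n} (x_k - \bar{x}_n)^2 + \frac{1}{n}(x_{n+1} - \bar{x}_n)^2 \\
&\quad + \frac{2}{n}\,\frac{\bar{x}_n - x_{n+1}}{n+1}\sum_{k=1}^{n+1} (x_k - \bar{x}_n) + \frac{n+1}{n}\,\frac{(\bar{x}_n - x_{n+1})^2}{(n+1)^2} \\
&= \frac{1}{n}\sum_{k=1}^{n} (x_k - \bar{x}_n)^2 + \frac{1}{n}(x_{n+1} - \bar{x}_n)^2 \\
&\quad + \frac{2}{n}\,\frac{\bar{x}_n - x_{n+1}}{n+1}\sum_{k=1}^{n+1} (x_k - \bar{x}_n) + \frac{1}{n}\,\frac{(\bar{x}_n - x_{n+1})^2}{n+1},
\end{align*}
* using $\frac{1}{(n+1) - 1} = \frac{1}{n}$ and ** using $\sum_{k=1}^{n+1} (x_k - \bar{x}_n)^2 = \sum_{k=1}^{n} (x_k - \bar{x}_n)^2 + (x_{n+1} - \bar{x}_n)^2$. From the
last part of the above equation, the first term can be shown to be equal to:
\[ \frac{1}{n}\sum_{k=1}^{n} (x_k - \bar{x}_n)^2 = \frac{n-1}{n}\,s_n^2 \]
while the second term (all remaining terms taken together) can be shown to be equal to:
\[ \frac{1}{n+1}(x_{n+1} - \bar{x}_n)^2. \]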
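Proceed as follows. Note $\sum_{k=1}^{n+1} (x_k - \bar{x}_n) = x_{n+1} - \bar{x}_n$, using $\sum_{k=1}^{n} (x_k - \bar{x}_n) = 0$; then the second term is
\begin{align*}
&\frac{1}{n}(x_{n+1} - \bar{x}_n)^2 + \frac{2}{n}\,\frac{\bar{x}_n - x_{n+1}}{n+1}\underbrace{\sum_{k=1}^{n+1} (x_k - \bar{x}_n)}_{= x_{n+1} - \bar{x}_n} + \frac{1}{n}\,\frac{(\bar{x}_n - x_{n+1})^2}{n+1} \\
&\quad= \frac{1}{n}\left[(x_{n+1} - \bar{x}_n)^2 - \frac{2(x_{n+1} - \bar{x}_n)^2}{n+1} + \frac{(\bar{x}_n - x_{n+1})^2}{n+1}\right] \\
&\quad= \frac{1}{n}\left[(x_{n+1} - \bar{x}_n)^2 - \frac{(x_{n+1} - \bar{x}_n)^2}{n+1}\right] \\
&\quad= \frac{1}{n} \cdot \frac{n}{n+1}(x_{n+1} - \bar{x}_n)^2 \\
&\quad= \frac{(x_{n+1} - \bar{x}_n)^2}{n+1},
\end{align*}
as required, *** using $(a - b)^2 = (-(a - b))^2 = (b - a)^2$.
Hence, we have $s_{n+1}^2 = \frac{n-1}{n}\,s_n^2 + \frac{(x_{n+1} - \bar{x}_n)^2}{n+1}$.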
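The two recursions translate directly into an update rule that avoids re-reading the whole sample. The following sketch checks it against the `statistics` module; the data values and the function name `update_mean_var` are illustrative assumptions, not part of the exercise.

```python
import statistics

def update_mean_var(n, mean_n, var_n, x_new):
    """Return (mean, sample variance) of n + 1 points from those of the first n (n >= 2)."""
    mean_next = (n * mean_n + x_new) / (n + 1)
    var_next = (n - 1) / n * var_n + (x_new - mean_n) ** 2 / (n + 1)
    return mean_next, var_next

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # illustrative values only

# Start from the first two observations, then add one observation at a time.
n = 2
mean, var = statistics.mean(data[:2]), statistics.variance(data[:2])
for x in data[2:]:
    mean, var = update_mean_var(n, mean, var, x)
    n += 1

print(mean, statistics.mean(data))
print(var, statistics.variance(data))
```

Note that the update uses the old mean $\bar{x}_n$ in the squared term, exactly as in the formula derived above.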
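Solution 1.30: [wk03Q9, Exercise, Schedule] Starting with the right-hand side, we have:
\begin{align*}
\int_{-\infty}^{\infty} E[Y \mid X = x]\,f_X(x)\,dx &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,f_{Y|X}(y \mid x)\,f_X(x)\,dy\,dx \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,\frac{f(x, y)}{f_X(x)}\,f_X(x)\,dy\,dx \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,f(x, y)\,dy\,dx \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,f(x, y)\,dx\,dy \\
&= \int_{-\infty}^{\infty} y \underbrace{\int_{-\infty}^{\infty} f(x, y)\,dx}_{= f_Y(y)}\,dy \\
&= \int_{-\infty}^{\infty} y\,f_Y(y)\,dy = E[Y].
\end{align*}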

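Solution 1.31: [wk03Q10, Exercise, Schedule] $X_1 \sim \text{Uniform}[0, 1]$ implies that $f_{X_1}(x_1) = 1$ for
$0 \le x_1 \le 1$, and zero otherwise, and conditional on $X_1$, $X_2 \sim \text{Uniform}[0, X_1]$ implies
\[ f_{X_2|X_1}(x_2 \mid x_1) = \frac{1}{x_1}, \quad\text{for } 0 \le x_2 \le x_1 \le 1, \]
and zero otherwise.
1. Thus, the joint density of $X_1$ and $X_2$ is given by:
\[ f(x_1, x_2) = f_{X_2|X_1}(x_2 \mid x_1)\,f_{X_1}(x_1) = \frac{1}{x_1}, \quad\text{for } 0 \le x_2 \le x_1 \le 1, \]
and zero otherwise.
The joint distribution function for $0 \le x_2 \le x_1 \le 1$ is:
\begin{align*}
F(x_1, x_2) &= \int_0^{x_2}\int_{u_2}^{x_1} f_{X_1,X_2}(u_1, u_2)\,du_1\,du_2 = \int_0^{x_2}\int_{u_2}^{x_1} \frac{1}{u_1}\,du_1\,du_2 \\
&= \int_0^{x_2} \log\left(\frac{x_1}{u_2}\right) du_2 = x_2\left(1 + \log\left(\frac{x_1}{x_2}\right)\right).
\end{align*}
Thus we have:
\[ F_{X_1,X_2}(x_1, x_2) = \begin{cases} 0, & \text{if } x_2 < 0 \text{ or } x_1 < 0; \\ x_2\left(1 + \log\left(\frac{x_1}{x_2}\right)\right), & \text{if } 0 \le x_2 \le x_1 \le 1; \\ x_1\left(1 + \log\left(\frac{x_1}{x_1}\right)\right) = x_1, & \text{if } 1 > x_2 > x_1 > 0; \\ 1, & \text{else.} \end{cases} \]
2. The marginal density of $X_2$ is:
\begin{align*}
f_{X_2}(x_2) &= \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1 = \int_{x_2}^{1} f(x_1, x_2)\,dx_1 \\
&= \int_{x_2}^{1} \frac{1}{x_1}\,dx_1 = -\log(x_2), \quad\text{for } 0 < x_2 < 1
\end{align*}
and zero otherwise, and hence the marginal distribution function is:
\[ F_{X_2}(x_2) = \begin{cases} 0, & \text{if } x_2 < 0; \\ \int_0^{x_2} -\log(u_2)\,du_2 = -\left[u_2\log(u_2) - u_2\right]_0^{x_2} = -x_2\log(x_2) + x_2, & \text{if } 0 \le x_2 \le 1; \\ 1, & \text{if } x_2 > 1. \end{cases} \]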

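Solution 1.32: [wk03Q11, Exercise, Schedule] First note that we can re-express the joint distribution
function as:
\[ F(x_1, x_2) = \frac{3}{2}x_1 x_2 - \frac{1}{2}x_1^2 x_2 - \frac{1}{2}x_1 x_2^2 + \frac{1}{2}x_1^2 x_2^2. \]
1. The joint density can be derived by taking the partial derivative twice:
\[ f(x_1, x_2) = \frac{\partial^2 F(x_1, x_2)}{\partial x_1\,\partial x_2} = \frac{3}{2} - x_1 - x_2 + 2x_1 x_2 \]
if $0 \le x_1, x_2 \le 1$ and zero otherwise.
2. The marginal densities are:
\[ f_{X_1}(x_1) = \frac{\partial F(x_1, 1)}{\partial x_1} = 3/2 - x_1 - 1/2 + x_1 = 1, \quad\text{for } 0 \le x_1 \le 1 \]
\[ f_{X_2}(x_2) = \frac{\partial F(1, x_2)}{\partial x_2} = 3/2 - 1/2 - x_2 + x_2 = 1, \quad\text{for } 0 \le x_2 \le 1 \]
and zero otherwise, so that the marginal distribution functions are:
\[ F_{X_1}(x_1) = \begin{cases} 0, & \text{if } x_1 < 0; \\ \int_0^{x_1} f_{X_1}(u_1)\,du_1 = \int_0^{x_1} 1\,du_1 = [u_1]_0^{x_1} = x_1, & \text{if } 0 \le x_1 \le 1; \\ 1, & \text{if } x_1 > 1, \end{cases} \]
and
\[ F_{X_2}(x_2) = \begin{cases} 0, & \text{if } x_2 < 0; \\ \int_0^{x_2} f_{X_2}(u_2)\,du_2 = \int_0^{x_2} 1\,du_2 = [u_2]_0^{x_2} = x_2, & \text{if } 0 \le x_2 \le 1; \\ 1, & \text{if } x_2 > 1. \end{cases} \]
They are both uniform on $[0, 1]$.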
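3. To derive the correlation, we first note that:
\[ E[X_1] = E[X_2] = \frac{1}{2} \]
and
\[ \mathrm{Var}(X_1) = \mathrm{Var}(X_2) = \frac{1}{12}. \]
Furthermore, we have that:
\begin{align*}
E[X_1 X_2] &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\,f_{X_1,X_2}(x_1, x_2)\,dx_1\,dx_2 \\
&= \int_0^1\int_0^1 x_1 x_2\left(\tfrac{3}{2} - x_1 - x_2 + 2x_1 x_2\right) dx_1\,dx_2 \\
&= \int_0^1\int_0^1 \left(\tfrac{3}{2}x_1 x_2 - x_1^2 x_2 - x_1 x_2^2 + 2x_1^2 x_2^2\right) dx_1\,dx_2 \\
&= \int_0^1 \left[\tfrac{3}{4}x_1^2 x_2 - \tfrac{1}{3}x_1^3 x_2 - \tfrac{1}{2}x_1^2 x_2^2 + \tfrac{2}{3}x_1^3 x_2^2\right]_0^1 dx_2 \\
&= \int_0^1 \left(\tfrac{3}{4}x_2 - \tfrac{1}{3}x_2 - \tfrac{1}{2}x_2^2 + \tfrac{2}{3}x_2^2\right) dx_2 \\
&= \left[\tfrac{3}{8}x_2^2 - \tfrac{1}{6}x_2^2 - \tfrac{1}{6}x_2^3 + \tfrac{2}{9}x_2^3\right]_0^1 \\
&= \frac{19}{72}.
\end{align*}
Therefore,
\[ \mathrm{Cov}(X_1, X_2) = E[X_1 X_2] - E[X_1]\,E[X_2] = \frac{19}{72} - \frac{1}{4} = \frac{1}{72} \]
so that
\[ \rho(X_1, X_2) = \frac{\mathrm{Cov}(X_1, X_2)}{\sqrt{\mathrm{Var}(X_1)\,\mathrm{Var}(X_2)}} = \frac{1/72}{1/12} = \frac{1}{6}. \]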
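Solution 1.33: [wk03Q12, Exercise, Schedule] We have the joint probability density function:
\[ f_{X_1,X_2}(x_1, x_2) = \begin{cases} k(1 - x_2), & \text{if } 0 \le x_1 \le x_2 \le 1; \\ 0, & \text{else.} \end{cases} \]
1. For $f_{X_1,X_2}(x_1, x_2)$ to be a (joint) probability density function, the function should satisfy the two
conditions:
1) $f_{X_1,X_2}(x_1, x_2) \ge 0$ for all $x_1, x_2$;
2) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X_1,X_2}(x_1, x_2)\,dx_1\,dx_2 = 1$.
1) is satisfied if $k \ge 0$.
For the second condition we calculate:
\begin{align*}
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X_1,X_2}(x_1, x_2)\,dx_1\,dx_2 &= \int_0^1\int_{x_1}^1 k(1 - x_2)\,dx_2\,dx_1 \\
&= k\int_0^1 \left[x_2 - \frac{x_2^2}{2}\right]_{x_1}^1 dx_1 \\
&= k\int_0^1 \left(\frac{1}{2} - x_1 + \frac{x_1^2}{2}\right) dx_1 \\
&= k\left[\frac{x_1}{2} - \frac{x_1^2}{2} + \frac{x_1^3}{6}\right]_0^1 \\
&= k\left(\frac{1}{2} - \frac{1}{2} + \frac{1}{6}\right) = k\,\frac{1}{6}.
\end{align*}
Thus, setting $k/6$ equal to 1 implies that $k = 6$.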
2. To determine the region of integration for $\Pr(X_1 \le 3/4, X_2 \ge 1/2)$ we have three
conditions, namely:
$0 \le X_1 \le X_2 \le 1$: part above the red-dotted line;
$X_1 \le 3/4$: part left of the blue-dashed line;
$X_2 \ge 1/2$: part above the black-solid line.
Hence, the upper left part of the figure is the region over which we integrate.
[Figure: two panels showing the region of integration in the $(x_1, x_2)$ plane, both axes running from 0 to 1.]
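3. To calculate $\Pr(X_1 \le 3/4, X_2 \ge 1/2)$ we split the integral into three parts, namely:
$0 \le X_1 \le 1/2$ and $1/2 \le X_2 \le 1$: black part;
$1/2 \le X_1 \le 3/4$ and $3/4 \le X_2 \le 1$: blue part;
$1/2 \le X_1 \le X_2 \le 3/4$: red part.
Thus we have:
\begin{align*}
\Pr(X_1 \le 3/4, X_2 \ge 1/2) &= \int_{1/2}^{1}\int_0^{3/4} f_{X_1,X_2}(x_1, x_2)\,dx_1\,dx_2 \\
&= \int_{1/2}^{1}\int_0^{1/2} k(1 - x_2)\,dx_1\,dx_2 + \int_{3/4}^{1}\int_{1/2}^{3/4} k(1 - x_2)\,dx_1\,dx_2 \\
&\quad + \int_{1/2}^{3/4}\int_{x_1}^{3/4} k(1 - x_2)\,dx_2\,dx_1 \\
&= \int_{1/2}^{1} \left[k(1 - x_2)x_1\right]_0^{1/2} dx_2 + \int_{3/4}^{1} \left[k(1 - x_2)x_1\right]_{1/2}^{3/4} dx_2 \\
&\quad + k\int_{1/2}^{3/4} \left[x_2 - \frac{x_2^2}{2}\right]_{x_1}^{3/4} dx_1 \\
&= \int_{1/2}^{1} \frac{k}{2}(1 - x_2)\,dx_2 + \int_{3/4}^{1} \frac{k}{4}(1 - x_2)\,dx_2 \\
&\quad + k\int_{1/2}^{3/4} \left(\frac{15}{32} - x_1 + \frac{x_1^2}{2}\right) dx_1 \\
&= \left[\frac{k}{2}\left(x_2 - \frac{x_2^2}{2}\right)\right]_{1/2}^{1} + \left[\frac{k}{4}\left(x_2 - \frac{x_2^2}{2}\right)\right]_{3/4}^{1} \\
&\quad + k\left[\frac{15x_1}{32} - \frac{x_1^2}{2} + \frac{x_1^3}{6}\right]_{1/2}^{3/4} \\
&= \frac{k}{2}\left(\frac{1}{2} - \frac{3}{8}\right) + \frac{k}{4}\left(\frac{1}{2} - \frac{15}{32}\right) + k\left(\frac{9}{64} - \frac{25}{192}\right) \\
&= \frac{3}{8} + \frac{3}{64} + \frac{4}{64} = \frac{31}{64} = 0.484375.
\end{align*}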
Solution 1.34: [wk03Q1, Exercise, Schedule] The data arranged in ascending order (read down each column):
 670  1880  3110  4270
 850  1940  3320  4630
 960  2380  3380  4670
1220  2430  3710  5400
1490  2490  3880  6780
1590  2710  4210  8230
1. Stem-and-Leaf Display of the claim amounts:
> stem(storm,scale=2)
The decimal point is 3 digit(s) to the right of the |
0 | 79
1 | 025699
2 | 4457
3 | 13479
4 | 2367
5 | 4
6 | 8
7 |
8 | 2
2. Mean = 3 175, Median = 2710 + 1/2 (3110 - 2710) = 2 910. The stem-and-leaf display appears
to show a positively skewed distribution.
3. $Q_1 = 1\,807.5$ and $Q_3 = 4\,225$. Therefore IQR $= Q_3 - Q_1 = 2\,417.5$.
4. $F_{24}(1\,000) = \frac{1}{24}\sum_{k=1}^{24} I(x_k \le 1\,000) = \frac{3}{24} = \frac{1}{8}$.
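The summary statistics can be reproduced from the 24 claim amounts with the Python standard library. This is a sketch under one assumption: the quartiles quoted above correspond to linear interpolation between order statistics, which is what `statistics.quantiles` does with `method="inclusive"`.

```python
import statistics

claims = [670, 850, 960, 1220, 1490, 1590, 1880, 1940, 2380, 2430, 2490, 2710,
          3110, 3320, 3380, 3710, 3880, 4210, 4270, 4630, 4670, 5400, 6780, 8230]

mean = statistics.mean(claims)
median = statistics.median(claims)
# "inclusive" interpolates linearly between the order statistics of the sample.
q1, _, q3 = statistics.quantiles(claims, n=4, method="inclusive")
iqr = q3 - q1
# Empirical distribution function evaluated at 1 000.
ecdf_1000 = sum(x <= 1000 for x in claims) / len(claims)
print(mean, median, q1, q3, iqr, ecdf_1000)
```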
Solution 1.35: [wk03Q2, Exercise, Schedule] Sample statistics:
1. Mode = 2
2. Median = 2
3. $Q_1 = 1$ and $Q_3 = 3$ so that IQR $= 3 - 1 = 2$.
4. Since the average value of 5 claims or more is 7.5, then the sum of claims of 5 or more is
$(5 \times 7.5) = 37.5$. Therefore
\[ \bar{x} = \frac{1}{100}\sum x_k = \frac{0(14) + 1(25) + 2(26) + 3(18) + 4(12) + 37.5}{100} = 2.165. \]
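Solution 1.36: [wk03Q3, Exercise, Schedule] Recall the formulas for the sample mean and variance:
\[ \bar{x} = \frac{1}{n}\sum x_k \quad\text{and}\quad s^2 = \frac{1}{n-1}\sum (x_k - \bar{x})^2 = \frac{1}{n-1}\left(\sum x_k^2 - n\bar{x}^2\right). \]
1. $\bar{x} = \frac{1}{32}(13\,337.6) = 416.8$ and $s = \sqrt{\frac{1}{31}\left[5667388.7 - 32\,(416.8)^2\right]} = \sqrt{3492.8071} = 59.1$.
2. Let $\bar{x}_{new}$ and $s_{new}^2$ be the new mean and variance, respectively, after deleting the largest observation. Thus,
\[ \bar{x}_{new} = \frac{1}{31}(13337.6 - 605) = 410.73 \]
and
\[ s_{new}^2 = \frac{1}{30}\left[5667388.7 - 605^2 - 31\,(410.73)^2\right] = 2389.686. \]
Therefore, $s_{new} = \sqrt{2389.686} = 48.9$.
3. Percentage change in the mean $= \frac{\text{new} - \text{old}}{\text{old}} \times 100\% = \frac{410.73 - 416.8}{416.8} \times 100\% = -1.46\%$.
4. Percentage change in the standard deviation $= \frac{\text{new} - \text{old}}{\text{old}} \times 100\% = \frac{48.9 - 59.1}{59.1} \times 100\% = -17.26\%$.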
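Solution 1.37: [wk03Q7, Exercise, Schedule] We are given that $Z = \alpha X + (1 - \alpha)Y$.
1. Therefore,
\begin{align*}
E[Z] &= E[\alpha X + (1 - \alpha)Y] \\
&= \alpha E[X] + (1 - \alpha)E[Y] \\
&= \alpha\mu + (1 - \alpha)\mu = \mu.
\end{align*}
2. First, we note that we can express the variance as:
\begin{align*}
\mathrm{Var}(Z) &= \mathrm{Var}(\alpha X + (1 - \alpha)Y) \\
&= \mathrm{Var}(\alpha X) + \mathrm{Var}((1 - \alpha)Y) + 2\,\mathrm{Cov}(\alpha X, (1 - \alpha)Y) \\
&= \alpha^2\,\mathrm{Var}(X) + (1 - \alpha)^2\,\mathrm{Var}(Y) + 2\alpha(1 - \alpha)\underbrace{\mathrm{Cov}(X, Y)}_{=0,\text{ by independence}} \\
&= \alpha^2\sigma_X^2 + (1 - \alpha)^2\sigma_Y^2.
\end{align*}
Taking the first order condition (FOC) with respect to $\alpha$, i.e., differentiating with respect to $\alpha$
and then equating to zero, we have:
\[ \frac{\partial}{\partial\alpha}\mathrm{Var}(Z) = 2\alpha\sigma_X^2 - 2(1 - \alpha)\sigma_Y^2 = 0, \]
which gives us:
\[ 2\alpha\sigma_X^2 = 2(1 - \alpha)\sigma_Y^2 \;\Rightarrow\; \frac{\alpha}{1 - \alpha} = \frac{\sigma_Y^2}{\sigma_X^2} \;\Rightarrow\; \alpha = \frac{\sigma_Y^2}{\sigma_X^2 + \sigma_Y^2}. \]
You must check the second derivative to ensure this gives the minimum!
3. $\frac{X + Y}{2}$ is better than either $X$ or $Y$ if it has smaller variance than both of them, i.e.,
\[ \mathrm{Var}\left(\frac{X + Y}{2}\right) < \mathrm{Var}(X) \quad\text{and}\quad \mathrm{Var}\left(\frac{X + Y}{2}\right) < \mathrm{Var}(Y). \]
Equivalently,
\begin{align*}
\tfrac{1}{4}\left(\sigma_X^2 + \sigma_Y^2\right) < \sigma_X^2 \quad&\text{and}\quad \tfrac{1}{4}\left(\sigma_X^2 + \sigma_Y^2\right) < \sigma_Y^2 \\
\sigma_Y^2 < 3\sigma_X^2 \quad&\text{and}\quad \sigma_X^2 < 3\sigma_Y^2 \\
\frac{\sigma_X^2}{\sigma_Y^2} > \frac{1}{3} \quad&\text{and}\quad \frac{\sigma_X^2}{\sigma_Y^2} < 3.
\end{align*}
Thus, it is better than either $X$ or $Y$ if
\[ \frac{1}{3} < \frac{\sigma_X^2}{\sigma_Y^2} < 3. \]
4. If there is a covariance, we have:
\[ \mathrm{Var}(Z) = \alpha^2\sigma_X^2 + (1 - \alpha)^2\sigma_Y^2 + 2\alpha(1 - \alpha)\sigma_{X,Y}. \]
Hence, we have the FOC:
\begin{align*}
\frac{\partial}{\partial\alpha}\mathrm{Var}(Z) = 2\alpha\sigma_X^2 - 2(1 - \alpha)\sigma_Y^2 + 2(1 - 2\alpha)\sigma_{X,Y} &= 0 \\
2\alpha\sigma_X^2 - 2\alpha\sigma_{X,Y} &= 2(1 - \alpha)\sigma_Y^2 - 2(1 - \alpha)\sigma_{X,Y} \\
\frac{\alpha}{1 - \alpha} &= \frac{\sigma_Y^2 - \sigma_{X,Y}}{\sigma_X^2 - \sigma_{X,Y}},
\end{align*}
thus we have:
\[ \alpha = \frac{\sigma_Y^2 - \sigma_{X,Y}}{\sigma_X^2 + \sigma_Y^2 - 2\sigma_{X,Y}}. \]
Note: second derivative:
\begin{align*}
\frac{\partial^2}{\partial\alpha^2}\mathrm{Var}(Z) &= 2\sigma_X^2 + 2\sigma_Y^2 - 4\sigma_{X,Y} \\
&= 2\left(\sigma_X^2 + \sigma_Y^2 - 2\sigma_{X,Y}\right) \\
&= 2\,\mathrm{Var}(X - Y) \ge 0.
\end{align*}
Thus, $\alpha = \frac{\sigma_Y^2 - \sigma_{X,Y}}{\sigma_X^2 + \sigma_Y^2 - 2\sigma_{X,Y}}$ is indeed the minimum.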
Solution 1.38: [wk04Q1, Exercise, Schedule] We are given X Exp() so that:

1 1
E [X] = and Var (X) = .
2
Also, N Poisson() so that:
E [N] = and Var (N) = .
1. The mean of S is:
E [S ] = E [E [S |N ]] = E [NX] = E [N] E [X] = /.
2. The variance of S is:
Var (S ) = E [N] Var (X) + (E [X])2 Var (N)

!2
1 1
= 2+ = 2/2 .

3. The m.g.f. of S is:

MS (t) = MN log MX (t) ,

where

MN (t) = e(e 1) .
t
MX (t) = and
t
Thus,

!
e ( ) 1
log t t
MS (t) = e = exp .
t
Solution 1.39: [wk04Q2, Exercise, Schedule] $X_k \sim \text{Exp}(1)$ implies that $f_{X_k}(x) = e^{-x}$ for $x \ge 0$ and
zero otherwise, for $k = 1, 2, 3$. We have that:
\[ F_{X_k}(x) = \begin{cases} 0, & \text{if } x < 0; \\ 1 - e^{-x}, & \text{if } x \ge 0, \end{cases} \]
for $k = 1, 2, 3$.
Let $X_{(1)} = \min\{X_1, X_2, X_3\}$ and $X_{(3)} = \max\{X_1, X_2, X_3\}$. Finding the distributions of the minimum and
the maximum, we have:
\[ F_{X_{(1)}}(x) = 1 - (1 - F(x))^3 = 1 - e^{-3x}, \]
for $x \ge 0$ and zero otherwise. So that:
\[ f_{X_{(1)}}(x) = 3e^{-3x}, \quad\text{for } x \ge 0, \]
and zero otherwise. This is Exp(3) and
\[ F_{X_{(3)}}(x) = (F(x))^3 = \left(1 - e^{-x}\right)^3, \]
\[ f_{X_{(3)}}(x) = 3e^{-x}\left(1 - e^{-x}\right)^2, \quad\text{for } x \ge 0, \]
and zero otherwise.
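1. The joint distribution of $\left(X_{(1)}, X_{(2)}, X_{(3)}\right)$ is given by:
\begin{align*}
f_{X_{(1)},X_{(2)},X_{(3)}}(y_1, y_2, y_3) &= 3!\,f(y_1)\,f(y_2)\,f(y_3) \\
&= 6e^{-(y_1 + y_2 + y_3)}, \quad\text{for } 0 \le y_1 \le y_2 \le y_3 < \infty,
\end{align*}
and zero otherwise. Therefore, we get the joint distribution of $\left(X_{(1)}, X_{(3)}\right)$ by integrating over all
possible values of $X_{(2)}$ as:
\begin{align*}
f_{X_{(1)},X_{(3)}}(y_1, y_3) &= \int_{y_1}^{y_3} 6e^{-(y_1 + y_2 + y_3)}\,dy_2 = 6\left[-e^{-(y_1 + y_2 + y_3)}\right]_{y_1}^{y_3} \\
&= 6\left(e^{-2y_1 - y_3} - e^{-y_1 - 2y_3}\right), \quad\text{for } 0 \le y_1 \le y_3 < \infty,
\end{align*}
and zero otherwise.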
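2. We can show that:
\begin{align*}
E\left[X_{(1)}\right] &= 3\int_0^\infty x\exp(-3x)\,dx \\
&= 3\left[\frac{\exp(-3x)}{9}(-3x - 1)\right]_0^\infty = \frac{1}{3},
\end{align*}
* using $\int x\exp(cx)\,dx = \left[\frac{\exp(cx)}{c^2}(cx - 1)\right]$, and (note $\exp(a)^b = \exp(a \cdot b)$):
\begin{align*}
E\left[X_{(3)}\right] &= 3\int_0^\infty y\exp(-y)(1 - \exp(-y))^2\,dy \\
&= 3\int_0^\infty \left(y\exp(-y) - 2y\exp(-2y) + y\exp(-3y)\right) dy \\
&= 3\left[\frac{\exp(-y)}{1}(-y - 1) - 2\,\frac{\exp(-2y)}{4}(-2y - 1) + \frac{\exp(-3y)}{9}(-3y - 1)\right]_0^\infty \\
&= 3\,(1 - 1/2 + 1/9) = \frac{11}{6}.
\end{align*}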
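3. We have that (note $\exp(a)^b = \exp(a \cdot b)$):
\begin{align*}
\mathrm{Var}\left(X_{(1)}\right) &= E\left[X_{(1)}^2\right] - \left(E\left[X_{(1)}\right]\right)^2 = 3\int_0^\infty x^2\exp(-3x)\,dx - \frac{1}{9} \\
&= 3\left[\exp(-3x)\left(-\frac{x^2}{3} - \frac{2x}{9} - \frac{2}{27}\right)\right]_0^\infty - \frac{1}{9} = \frac{1}{9} \\
\mathrm{Var}\left(X_{(3)}\right) &= E\left[X_{(3)}^2\right] - \left(E\left[X_{(3)}\right]\right)^2 = 3\int_0^\infty y^2\exp(-y)(1 - \exp(-y))^2\,dy - \left(\frac{11}{6}\right)^2 \\
&= 3\int_0^\infty \left(y^2\exp(-y) - 2y^2\exp(-2y) + y^2\exp(-3y)\right) dy - \left(\frac{11}{6}\right)^2 \\
&= 3\left[\exp(-y)\left(-\frac{y^2}{1} - \frac{2y}{1} - \frac{2}{1}\right) - 2\exp(-2y)\left(-\frac{y^2}{2} - \frac{2y}{4} - \frac{2}{8}\right)\right. \\
&\qquad\left. + \exp(-3y)\left(-\frac{y^2}{3} - \frac{2y}{9} - \frac{2}{27}\right)\right]_0^\infty - \left(\frac{11}{6}\right)^2 \\
&= 3\left(2 - 2 \cdot \frac{1}{4} + \frac{2}{27}\right) - \left(\frac{11}{6}\right)^2 = \frac{49}{36},
\end{align*}
** using $\int x^2\exp(cx)\,dx = \left[\exp(cx)\left(\frac{x^2}{c} - \frac{2x}{c^2} + \frac{2}{c^3}\right)\right]$.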
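4. Now, for:
\begin{align*}
E\left[X_{(1)}X_{(3)}\right] &= \int_0^\infty\int_x^\infty xy\,f_{X_{(1)},X_{(3)}}(x, y)\,dy\,dx = \int_0^\infty\int_x^\infty xy \cdot 6e^{-(x+y)}\left(e^{-x} - e^{-y}\right) dy\,dx \\
&= \int_0^\infty 6x\int_x^\infty \left(e^{-2x}\,ye^{-y} - e^{-x}\,ye^{-2y}\right) dy\,dx \\
&= \int_0^\infty 6x\left[e^{-2x}\,\frac{e^{-y}}{1}(-y - 1) - e^{-x}\,\frac{e^{-2y}}{4}(-2y - 1)\right]_x^\infty dx \\
&= \int_0^\infty 6x\left(e^{-2x}\,\frac{e^{-x}}{1}(x + 1) - e^{-x}\,\frac{e^{-2x}}{4}(2x + 1)\right) dx \\
&= \int_0^\infty \left(6x^2e^{-3x} + 6xe^{-3x} - \frac{12}{4}x^2e^{-3x} - \frac{6}{4}xe^{-3x}\right) dx \\
&= \int_0^\infty \left(3x^2e^{-3x} + \frac{9}{2}xe^{-3x}\right) dx \\
&= 3\left[e^{-3x}\left(-\frac{x^2}{3} - \frac{2x}{9} - \frac{2}{27}\right)\right]_0^\infty + \frac{9}{2}\left[\frac{e^{-3x}}{9}(-3x - 1)\right]_0^\infty \\
&= 3 \cdot \frac{2}{27} + \frac{1}{2} = \frac{13}{18},
\end{align*}
again * using $\int x\exp(cx)\,dx = \left[\frac{\exp(cx)}{c^2}(cx - 1)\right]$,
** using $\int x^2\exp(cx)\,dx = \left[\exp(cx)\left(\frac{x^2}{c} - \frac{2x}{c^2} + \frac{2}{c^3}\right)\right]$.
Therefore, we have:
\begin{align*}
\mathrm{Cov}\left(X_{(1)}, X_{(3)}\right) &= E\left[X_{(1)}X_{(3)}\right] - E\left[X_{(1)}\right]E\left[X_{(3)}\right] \\
&= \frac{13}{18} - \left(\frac{1}{3}\right)\left(\frac{11}{6}\right) = \frac{1}{9}.
\end{align*}
Therefore, the required correlation coefficient is:
\begin{align*}
\rho\left(X_{(1)}, X_{(3)}\right) &= \frac{\mathrm{Cov}\left(X_{(1)}, X_{(3)}\right)}{\sqrt{\mathrm{Var}\left(X_{(1)}\right)\mathrm{Var}\left(X_{(3)}\right)}} \\
&= \frac{1/9}{\sqrt{(1/9)(49/36)}} = \frac{2}{7}.
\end{align*}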
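A quick simulation gives an independent check on $E[X_{(1)}] = 1/3$, $E[X_{(3)}] = 11/6$ and $\rho = 2/7 \approx 0.2857$. This is a Monte Carlo sketch, not an exact verification: the seed, the number of draws and the function name are arbitrary choices for the illustration, and the estimates only agree with the exact values up to simulation error.

```python
import random

def simulate_min_max(n_draws, seed=2131):
    """Estimate E[min], E[max] and corr(min, max) for three independent Exp(1) draws."""
    rng = random.Random(seed)
    mins, maxs = [], []
    for _ in range(n_draws):
        draws = [rng.expovariate(1.0) for _ in range(3)]
        mins.append(min(draws))
        maxs.append(max(draws))
    mean_min = sum(mins) / n_draws
    mean_max = sum(maxs) / n_draws
    cov = sum((a - mean_min) * (b - mean_max) for a, b in zip(mins, maxs)) / n_draws
    var_min = sum((a - mean_min) ** 2 for a in mins) / n_draws
    var_max = sum((b - mean_max) ** 2 for b in maxs) / n_draws
    return mean_min, mean_max, cov / (var_min * var_max) ** 0.5

mean_min, mean_max, rho = simulate_min_max(200_000)
print(mean_min, mean_max, rho)
```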
Solution 1.40: [wk04Q3, Exercise, Schedule] X Gamma(, 1) implies:

!
x1 ex 1
fX (x) = , for x 0 and MX (t) = .
() 1t
Similarly, Y Gamma(, 1)implies:
!
y1 ey 1
fY (y) = , for y 0 and MY (t) = .
() 1t
1. Using the m.g.f. technique, we have:

h i h i h i h i
MU (t) = E eUt = E e(X+Y)t = E eXt E eYt
!+
1
= MX (t) MY (t) =
1t
which is the m.g.f. of a Gamma( + , 1).
2. By independence, we note that:
x1 ex y1 ey
f (x, y) = fX (x) fY (y) = .
() ()
The inverse of the transformation:
x
u= x+y and v=
x+y
is given by:
x = uv and y = u uv = u (1 v) .
Which is derived by: x = u y v = uy
u
uv = u y y = (v 1)u y = (1 v)u
x = u u(1 v) x = uv.
Its jacobian is:
h1 (u, v) /u h1 (u, v) /v

v
u
J (u, v) = det = det

h2 (u, v) /u h2 (u, v) /v 1 v u

= uv u(1 v) = u.
Thus |J (u, v) | = u, because 0 < u < . By the Jacobian transformation technique, the joint
density of U and V is:
1 (uv)1 euv [u (1 v)]1 eu(1v)

fUV (u, v) =
1/u () ()
for 0 < u < and 0 < v < 1 and zero otherwise.
3. Use $e^{-uv}\,e^{-u(1-v)} = e^{-u}$; then we can further simplify the joint density as:
1 1
fUV (u, v) = u|+1 u 1
{ze } v| (1{z v)1 .
() () }
| {z } function of u alone function of v alone
constant
Thus, we see that we can express the joint density as a product of functions of u alone and v
alone, i.e., fU,V (u, v) = fU (u) fV (v). Therefore, U and V are independent.
4. Note x, y 0, thus 0 X
X+Y
X
X
= 1. For the marginal of U, we have
u+1 eu v1 (1 v)1
1
Z
fU (u) = dv
0 () ()
u+1 eu 1 ( + ) 1
Z
= v (1 v)1 dv
( + ) 0 () ()
| {z }
density of a Beta(,) =1
(+)1 u
u e
= , for u > 0
( + )
and zero otherwise. This is the density of a Gamma( + , 1). This reinforces the result in (a).
Note: 0 X + Y . For the marginal of V, we have:
e v (1 v)1
Z +1 u 1
u
fV (v) = du
0 () ()
Z +1 u
( + ) 1 1 u e
= v (1 v) du
() () ( + )
|0 {z }
density of a Gamma(+,1) =1
( + ) 1
= v (1 v)1 , for 0 < v < 1
() ()
and zero otherwise. This is the density of a Beta(, ).
5. Since X = UV and by independence, we have:
E [X] = E [U] E [V]

= ( + ) E [V]
so that

E [V] = .
+
Similarly, we have for the variance:
h i
Var (X) = Var (UV) = E U 2 V 2 (E [UV])2
h i h i
= E U 2 E V 2 2
so that
h i + 2 + 2
E V 2 = 2 = .
E U ( + ) (1 + + )
Thus, the variance of V is:

h i
Var (V) = E V 2 (E [V])2
!2
+ 2
=
( + ) (1 + + ) +
( + ) ( + ) (1 + + )
2 2
=
( + )2 (1 + + )

=
( + ) (1 + + )
2
Solution 1.41: [wk04Q4, Exercise, Schedule] Xk Exp(1) implies that fXk (x) = ex for x 0 and
zero otherwise, for k = 1, 2, 3. We have that:
if x < 0;
(
0,
F Xk (x) =
1 ex , if x 0,
for k = 1, 2, 3.
Let X(1) = min {X1 , X2 , X3 } and X(3) = max {X1 , X2 , X3 }. Finding the distributions of the minimum and
the maximum, we have:
F X(1) (x) = 1 (1 F (x))3 = 1 e3x ,
fX(1) (x) = 3e3x , for x 0,
and zero otherwise. This is Exp(3) and
F X(3) (x) = (F (x))3 = 1 ex 3 ,

fX(3) (x) = 3ex 1 ex 2 ,

for x 0,
and zero otherwise. The joint distribution of X(1) , X(2) , X(3) is given by:

fX(1) ,X(2) ,X(3) (y1 , y2 , y3 ) = 3! f (y1 ) f (y2 ) f (y3 )

= 6e(y1 +y2 +y3 ) , for 0 y1 y2 y3 < .
and zero otherwise. Therefore, we get the joint distribution of X(1) , X(3) by integrating over all

possible values of X(2) as:
Z y3 iy3
6e(y1 +y2 +y3 ) dy2 = 6 e(y1 +y2 +y3 )
h
fX(1) ,X(3) (y1 , y3 ) =
y1
y1

= 6 e2y1 y3 ey1 2y3 , for 0 y1 y3 < ,
and zero otherwise.

1. First, we find the conditional density of X(3) X(1) and by definition,
fX(1) X(3) (y1 , y3 )

fX(3) |X(1) (y3 |y1 ) =
fX (y1 )
(1)
6 e2y1 y3 ey1 2y3
= , for 0 y1 y3 < ,
3e3y1
and zero otherwise.

Replacing y1 = x and y3 = y, we have:

fX(3) |X(1) (y |x ) = 2 e(xy) e2(xy) , for 0 x y < ,
and zero otherwise. Thus,

h i Z
E X(3) X(1) = x = 2y e(xy) e2(xy) dy
x
" 2y #
2x e
= 2e e (y 1) x 2e
x y
(2y 1)
4 x
2x
e
= 2e x (x + 1)ex 2e2x (2x + 1)
4
3
= x+ ,
2
h i
ecy
yecy dy =
R
* using c2
(cy 1) .

2. Similarly, we can find the conditional density of X(1) X(3) as:

2 e2y e(y+x)
fX(1) |X(3) (y |x ) = , for 0 y x < ,
(1 ex )2
and zero otherwise. Thus,
h
i Z x 2 e2y e(y+x)
E X(1) X(3) = x = y dy
0 (1 ex )2
" 2x #x !
2 e
= x y x
(2y 1) e e (y 1) 0
(1 ex )2 4 0
2xe2y e2y + 1 + 4xe2x + 4e2x 4ex
=
2(1 ex )2
2y
!
2 e 1
= (2x 1) + + e (x + 1) e
2x x
(1 ex )2 4 4

1 4ex + 3e2x + 2xe2x
= ,
2 (1 ex )2
h cx i
* using xecx dx = ec2 (cx 1) .
R
3. Already derived earlier.

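4. We use the Jacobian transformation technique by first letting:
\[ R = X_{(3)} - X_{(1)} \quad\text{and}\quad S = X_{(1)} \]
with the inverse transformation:
\[ X_{(1)} = S \quad\text{and}\quad X_{(3)} = R + S. \]
The Jacobian of this transformation is given by:
\[ J(R, S) = \det\begin{bmatrix} \partial S/\partial R & \partial S/\partial S \\ \partial (R + S)/\partial R & \partial (R + S)/\partial S \end{bmatrix} = \det\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} = -1. \]
Thus, $|J(R, S)| = 1$ and
\begin{align*}
f_{R,S}(r, s) &= 6\left(e^{-2s - (r + s)} - e^{-s - 2(r + s)}\right) \\
&= 6e^{-3s - r}\left(1 - e^{-r}\right), \quad\text{for } 0 \le s < r + s < \infty,
\end{align*}
and zero otherwise, where the range is equivalently:
\[ 0 \le s < \infty \quad\text{and}\quad 0 \le r < \infty. \]
Thus, the marginal density of $R$ is obtained by integrating over all possible values of $S$:
\begin{align*}
f_R(r) &= \int_0^\infty 6e^{-3s - r}\left(1 - e^{-r}\right) ds \\
&= 6\left(1 - e^{-r}\right)\left[-\frac{1}{3}e^{-3s - r}\right]_0^\infty \\
&= 6\left(1 - e^{-r}\right)\frac{e^{-r}}{3} \\
&= 2e^{-r}\left(1 - e^{-r}\right), \quad\text{for } 0 \le r < \infty,
\end{align*}
and zero otherwise.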

Solution 1.42: [wk04Q5, Exercise, Schedule] Use the m.g.f. technique. Recall that if X N , 2 ,
then
MX (t) = et+ 2 t .
1 2 2
1. Let S = X1 + X2 .
MS (t) = E e(X1 +X2 )t = E eX1 t E eX2 t

h i h i h i
1 2 1 2 1 2
= e 2 t e 2 t = e 2 (2)t
which is the m.g.f. of a N (0, 2).
2. Let D = X1 X2 .
h i h i h i
MD (t) = E e(X1 X2 )t = E eX1 t E eX2 (t)
1 2 1 2 1 2
= e 2 t e 2 (t) = e 2 (2)t
which is the m.g.f. of a N (0, 2). Thus, D has the same distribution as S .
3. Now, assume that they are no longer independent and has the bivariate normal density:
1 1 !
fX1 ,X2 (x1 , x2 ) = exp x1 2x1 x2 + x2 .
2 2
2 1 2 2 1 2
p
Using Jacobian transformation technique, we find the joint distribution of S and D. From
S = X1 + X2 and D = X1 X2
the inverse of this transformation is

1 1
X1 = (S + D) and X2 = (S D) ,
2 2
which is derived by X1 = S X2 D = S X2 X2 X2 = S D 2
X1 = S S D 2
= S +D
2
. Its
Jacobian is:
(S + D)/2 /S (S + D)/2 /D

1/2 1/2
J (S , D) = det = det = 1/41/4 = 1/2.

(S D)/2 /S (S D)/2 /D 1/2 1/2

Thus |J (S , D) | = 1/2. Therefore,
1 (s + d) 2 2 1 (s + d) 1

1 1
fS ,D (s, d) = exp 2 2
2 1 2 2 1 2 1 (s d) + 1 (s d) 2 |2|
p
2 2
1 1 !
= exp s + d 2 s d + s + d
2 2 2 2 2 2
4 1 2 8 1 2
p
1 1 !
= exp (1 ) s + (1 + ) d
2 2
4 1 2 4 1 2
p
s2 d2
! !
1
= exp exp
4 1 2 4 (1 + ) 4 (1 )
p
Therefore, clearly we can write the density as a product of functions of s alone and d alone. S
and D are therefore independent.
4. We have that the p.d.f. of X is given by:

fX (x) = x1 ex , if x > 0,
()
and zero otherwise.
(a) The transformation g(X) = 1/X is a monotonic decreasing function for x > 0, because
d 1x
dg(x)
d
x = d
x = x2 < 0 for x > 0. Hence, we can apply the CDF technique, with
g1 (Y) 1y
g(Y) = 1/X, g1 (Y) = 1/Y, and y
= y
= y2 < 0, support of Y: g(0) = ,
g() = 0 we have:

fY (y) = fX (g1 (y)) g1 (y)
y

!1
1
= e y y2
() y

= (y)+1 e y y2
()

= y1 e y
()
for y > 0 and zero otherwise.
(b) The c.d.f. of the inverse gamma distribution, as function of the c.d.f. of the gamma
distribution, is given by applying the CDF technique:
FY (y) =1 F X (g1 (y))

=1 F X (1/y).
Solution 1.43: [wk05Q14, Exercise, Schedule] The p.d.f. of a chi-squared distribution with one
degree of freedom is:
\[ f_Y(y) = \frac{\exp(-y/2)}{\sqrt{2\pi y}}, \quad\text{if } y > 0, \]
and zero otherwise. We need to prove that the moment generating function of $Y$ is given by:
\[ M_Y(t) = (1 - 2t)^{-1/2}. \]
Using the transformation $x = \sqrt{-2y(t - 1/2)}$ (for $t < 1/2$) and thus $dy = \frac{2\sqrt{y}}{\sqrt{-2(t - 1/2)}}\,dx$ we have:
\begin{align*}
M_Y(t) = \int_0^\infty e^{ty} f_Y(y)\,dy &= \int_0^\infty \exp(ty)\,\frac{\exp(-y/2)}{\sqrt{2\pi y}}\,dy \\
&= \int_0^\infty \frac{\exp(y(t - 1/2))}{\sqrt{2\pi y}}\,dy \\
&= \frac{2}{\sqrt{-2(t - 1/2)}}\int_0^\infty \frac{\exp(-x^2/2)}{\sqrt{2\pi}}\,dx \\
&= \frac{2}{\sqrt{-2(t - 1/2)}} \cdot \frac{1}{2} \\
&= \frac{1}{\sqrt{-2(t - 1/2)}} = (2(-t + 1/2))^{-0.5} = (1 - 2t)^{-0.5},
\end{align*}
* using that $\int_0^\infty \frac{\exp(-x^2/2)}{\sqrt{2\pi}}\,dx$ is the integral of the p.d.f. of a standard normal random variable
over the positive values of $x$. Due to the symmetry of the standard normal distribution about 0,
this integral equals 1/2.
Solution 1.44: [wk05Q15, Exercise, Schedule] We need to prove that:

d
tn1 N(0, 1) as n .
This means that a t-distribution converges in distribution to a standard normal distribution as $n \to \infty$.
Here we cannot use the moment generating function, because it is not defined for a Student-t
distribution. Note that the definition of convergence in distribution is:
$X_n$ converges in distribution to the random variable $X$ as $n \to \infty$ if and only if, for every $x$ at which $F_X$ is continuous:
\[ F_{X_n}(x) \to F_X(x) \quad\text{as } n \to \infty. \]
This implies that one can use the cumulative distribution functions of the Student-t distribution and the
standard normal distribution to prove the convergence. However, these do not have closed-form
expressions. Therefore, we will prove that the probability density function of a Student-t distribution
converges to the standard normal one as $n \to \infty$. When the probability density functions converge pointwise,
the cumulative distribution functions must also converge (Scheffé's lemma).
We have:

n+1
21 x2
!(n+1)/2
lim ft|n (x) = lim 1 +
n n n n n
2
r !(n+1)/2
n 1 x2
= lim 1+
n 2 n n
n/21/2
x2 /2
!
1
= lim 1 +
n 2 n/2
1 1 1
= lim n/2
q
1 + xn/2/2
2

1 + xn/2/2
n 2 2
1 1 1
= 1/2x2 lim q
2 e n
1+ x2 /2
n/2
1 2
= e1/2x ,
2
( n+1 )
which is the probability density function of a standard normal random variable, * using lim ( 2n ) =
n 2
a n

= + q 1 =
pn a
2
, ** using e lim 1 n
, and *** using lim x2 /2
1.
n n 1+ n/2
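Solution 1.45: [wk05Q16, Exercise, Schedule] i) Define transformations:
\[ F = \frac{U/n_1}{V/n_2}, \qquad G = V. \]
ii) Determine the inverse of the transformations:
\[ V = G, \qquad U = n_1 F V/n_2 = n_1 F G/n_2. \]
iii) Calculate the absolute value of the Jacobian of the inverse transformation:
\[ J = \det\begin{bmatrix} 0 & 1 \\ g\,\frac{n_1}{n_2} & f\,\frac{n_1}{n_2} \end{bmatrix} = -g\,\frac{n_1}{n_2}, \qquad |J| = g\,\frac{n_1}{n_2}. \]
iv) Determine the joint probability density function of F and G:
\[
\begin{aligned}
f_{FG}(f, g) &= |J|\, f_{UV}(u, v) \overset{*}{=} |J|\, f_U(u)\, f_V(v) \\
&= \frac{n_1 g}{n_2} \cdot \frac{u^{(n_1-2)/2}}{2^{n_1/2}\,\Gamma(n_1/2)} \exp(-u/2) \cdot \frac{v^{(n_2-2)/2}}{2^{n_2/2}\,\Gamma(n_2/2)} \exp(-v/2) \\
&\overset{**}{=} \frac{n_1 g}{n_2} \cdot \frac{(f n_1 g/n_2)^{(n_1-2)/2}}{2^{n_1/2}\,\Gamma(n_1/2)} \exp\left(-\frac{f n_1 g}{2 n_2}\right) \cdot \frac{g^{(n_2-2)/2}}{2^{n_2/2}\,\Gamma(n_2/2)} \exp(-g/2) \\
&\overset{***}{=} \frac{n_1}{n_2} \cdot \frac{(f n_1/n_2)^{(n_1-2)/2}}{2^{n_1/2}\,\Gamma(n_1/2)} \cdot g^{(n_1+n_2-2)/2} \exp\left(-g\left(\frac{1}{2} + \frac{f n_1}{2 n_2}\right)\right) \cdot \frac{1}{2^{n_2/2}\,\Gamma(n_2/2)},
\end{aligned}
\]
* using independence between U and V, ** using the inverse transformation determined in step ii), and *** using $\exp(ga)\exp(gb) = \exp(g(a + b))$ and $a^b a^c = a^{b+c}$.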
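v) Calculate the marginal distribution of F by integrating over the other variable:
\[
\begin{aligned}
f_F(f) &= \int_0^\infty f_{FG}(f, g)\,dg \\
&= \frac{1}{2^{n_2/2}\,\Gamma(n_2/2)} \cdot \frac{n_1}{n_2} \cdot \frac{(f n_1/n_2)^{(n_1-2)/2}}{2^{n_1/2}\,\Gamma(n_1/2)} \int_0^\infty g^{(n_1+n_2-2)/2} \exp\left(-g\left(\frac{1}{2} + \frac{f n_1}{2 n_2}\right)\right) dg \\
&\overset{*}{=} \frac{1}{2^{n_2/2}\,\Gamma(n_2/2)} \cdot \frac{n_1}{n_2} \cdot \frac{(f n_1/n_2)^{(n_1-2)/2}}{2^{n_1/2}\,\Gamma(n_1/2)} \left(\frac{2 n_2}{n_2 + f n_1}\right)^{(n_1+n_2-2)/2} \frac{2 n_2}{n_2 + f n_1} \int_0^\infty x^{(n_1+n_2-2)/2} \exp(-x)\,dx \\
&\overset{**}{=} \frac{1}{2^{n_2/2}\,\Gamma(n_2/2)} \cdot \frac{n_1}{n_2} \cdot \frac{(f n_1/n_2)^{(n_1-2)/2}}{2^{n_1/2}\,\Gamma(n_1/2)} \left(\frac{2 n_2}{n_2 + f n_1}\right)^{(n_1+n_2)/2} \Gamma\left(\frac{n_1 + n_2}{2}\right) \\
&= \frac{f^{(n_1-2)/2}\, n_1^{n_1/2}}{2^{(n_1+n_2)/2}\, n_2^{n_1/2}} \left(\frac{2 n_2}{n_2 + f n_1}\right)^{(n_1+n_2)/2} \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \\
&= \frac{f^{(n_1-2)/2}\, n_1^{n_1/2}}{n_2^{n_1/2}} \left(\frac{n_2}{n_2 + f n_1}\right)^{(n_1+n_2)/2} \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \\
&= \frac{f^{(n_1-2)/2}\, n_1^{n_1/2}\, n_2^{n_2/2}}{(n_2 + f n_1)^{(n_1+n_2)/2}} \cdot \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \\
&= n_1^{n_1/2}\, n_2^{n_2/2}\, \frac{\Gamma((n_1 + n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)} \cdot \frac{f^{n_1/2 - 1}}{(n_2 + f n_1)^{(n_1+n_2)/2}},
\end{aligned}
\]
* using the transformation $x = \left(\frac{1}{2} + \frac{f n_1}{2 n_2}\right) g$, thus $g = \frac{2 n_2}{n_2 + f n_1}\,x$ and $dx = \left(\frac{1}{2} + \frac{f n_1}{2 n_2}\right) dg$, and
** using $\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} \exp(-x)\,dx$.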
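A quick numerical check of the final expression in step v) is sketched below: the derived density should integrate to one, and for $n_2 > 2$ its mean should be close to $n_2/(n_2 - 2)$, the known mean of an F distribution. The degrees of freedom $(4, 6)$, the truncation point 200 and the helper names are illustrative assumptions, not part of the original solution.

```python
import math

def f_density(f, n1, n2):
    """Density derived in step v) for F = (U/n1)/(V/n2), on the log scale."""
    if f <= 0:
        return 0.0
    log_value = (
        math.lgamma((n1 + n2) / 2) - math.lgamma(n1 / 2) - math.lgamma(n2 / 2)
        + (n1 / 2) * math.log(n1) + (n2 / 2) * math.log(n2)
        + (n1 / 2 - 1) * math.log(f)
        - ((n1 + n2) / 2) * math.log(n2 + f * n1)
    )
    return math.exp(log_value)

def integrate(func, lower, upper, steps):
    """Plain midpoint rule on [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(func(lower + (i + 0.5) * h) for i in range(steps))

n1, n2 = 4, 6  # hypothetical degrees of freedom
total_mass = integrate(lambda f: f_density(f, n1, n2), 0.0, 200.0, 200_000)
print(total_mass)
```

The integral is truncated at 200, so the result is slightly below one; the missing tail mass is of order $10^{-6}$ for these degrees of freedom.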
I. C
A t-distribution is obtained by a standard normal r.v. divided by the square root of a chi-squared r.v. divided by its degrees of freedom.
We have $Z_1 + Z_2 \sim N(0, 2)$, i.e., $\frac{Z_1 + Z_2}{\sqrt{2}} \sim N(0, 1)$ (see lecture notes).
For a chi-squared distribution we have the m.g.f.: $M_{V_i}(t) = (1 - 2t)^{-r_i/2}$ for $i = 1, 2$. Hence,
\[
M_{V_1}(t) \cdot M_{V_2}(t) = (1 - 2t)^{-r_1/2}\,(1 - 2t)^{-r_2/2} = (1 - 2t)^{-(r_1 + r_2)/2},
\]
which is the m.g.f. of a chi-squared distribution with $r_1 + r_2$ degrees of freedom. Hence, $V_1 + V_2$ has a chi-squared distribution with $r_1 + r_2$ degrees of freedom.
II. E
See lecture notes/ previous question.
III. C
We have:
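\[
\begin{aligned}
Z_1 + Z_2 &\sim N\left(0, \mathrm{Var}(Z_1) + \mathrm{Var}(Z_2) + 2\,\mathrm{Cov}(Z_1, Z_2)\right) \\
&\sim N(0, 1 + 1 + 2 \cdot \rho \cdot 1 \cdot 1) \\
&\sim N(0, 2 + 2\rho) \\
&\sim N(0, 2\,(1 + \rho)).
\end{aligned}
\]
Thus $\mathrm{Var}\left(\frac{Z_1 + Z_2}{\sqrt{2}}\right) = \frac{2(1 + \rho)}{2} = 1 + \rho \neq 1$.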
IV. D
We have $M_{X_k}(t) = (1 - t/\lambda)^{-1}$ for $k = 1, \ldots, n$. Let $Y_k = X_k/n$, then we have: $M_{Y_k}(t) = M_{X_k/n}(t) = M_{X_k}(t/n) = \left(1 - \frac{t}{n\lambda}\right)^{-1}$ for $k = 1, \ldots, n$.
Using the m.g.f. technique we determine the distribution of the sample mean by the m.g.f.:
\[
M_{\bar{X}}(t) = M_{Y_1}(t) \cdot \ldots \cdot M_{Y_n}(t) = \left(1 - \frac{t}{n\lambda}\right)^{-n},
\]
which is the m.g.f. of a Gamma distribution with parameters $n$ and $n\lambda$.
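V. D
Use the m.g.f. technique. $M_{X_k}(t) = \exp(\lambda(\exp(t) - 1))$ for $k = 1, \ldots, n$. We have:
\[
\begin{aligned}
M_S(t) &= M_{X_1}(t) \cdot \ldots \cdot M_{X_n}(t) \\
&= \prod_{k=1}^{n} \exp(\lambda(\exp(t) - 1)) \\
&= \exp(\lambda(\exp(t) - 1))^{n} \\
&= \exp(n\lambda\,(\exp(t) - 1)),
\end{aligned}
\]
which is the m.g.f. of a Poisson r.v. with mean $n\lambda$.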
VI. B
We have:
\[
\begin{aligned}
\Pr\left(X_{(20)} > 1\right) &= 1 - \Pr\left(X_{(20)} \le 1\right) \\
&= 1 - (F_X(1))^{20} \\
&= 1 - (1 - \exp(-2))^{20}.
\end{aligned}
\]
VII. D
We have:

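\[
F_X(x) = \int_{-\infty}^{x} f_X(u)\,du =
\begin{cases}
0, & \text{if } x \le 0; \\
\int_0^x 2u\,du = \left[u^2\right]_0^x = x^2, & \text{if } 0 < x < 1; \\
1, & \text{if } x \ge 1.
\end{cases}
\]
Let $U = X_{(n)}$, then:
\[
f_U(u) = n\, f_X(u)\, (F_X(u))^{n-1} = n \cdot 2u \cdot u^{2(n-1)},
\]
for $u \in [0, 1]$ and zero otherwise.
Thus we have:
\[
\begin{aligned}
E[U] &= \int u\, f_U(u)\,du = \int_0^1 u \cdot n \cdot 2u \cdot u^{2(n-1)}\,du \\
&= 2n \int_0^1 u^{2n}\,du \\
&= 2n \left[\frac{u^{2n+1}}{2n + 1}\right]_0^1 \\
&= \frac{2n}{2n + 1}.
\end{aligned}
\]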
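The value $2n/(2n+1)$ can be checked by simulation. The sketch below is an illustration under stated assumptions: it draws from the density $2x$ on $(0, 1)$ by inverse transform ($X = \sqrt{U}$, since $F_X(x) = x^2$), and the sample size $n = 5$, the number of replications and the seed are arbitrary choices made here.

```python
import random

def simulate_expected_max(n, reps=20_000, seed=2131):
    """Monte Carlo estimate of E[X_(n)] when f_X(x) = 2x on (0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # Inverse transform: F_X(x) = x**2 on (0, 1), so X = sqrt(U).
        total += max(rng.random() ** 0.5 for _ in range(n))
    return total / reps

n = 5
print(simulate_expected_max(n), 2 * n / (2 * n + 1))
```

The simulated mean should be close to the exact value, up to Monte Carlo error.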
VIII. E
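We have that $X \sim U(8.5, 10.5)$, then $f_X(x) = 1/2$ if $x \in [8.5, 10.5]$ and zero otherwise, and we have:
\[
F_X(x) =
\begin{cases}
0, & \text{if } x < 8.5; \\
\frac{x - 8.5}{2}, & \text{if } 8.5 \le x \le 10.5; \\
1, & \text{if } x > 10.5.
\end{cases}
\]
Then we have:
\[
\Pr(\text{loser will not break world record}) = \Pr\left(X_{(8)} \ge 9.9\right) = 1 - \Pr\left(X_{(8)} < 9.9\right) = 1 - F_X(9.9)^8 = 1 - 0.7^8.
\]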
Module 2
Parameter Estimation
2.1 Estimation Techniques

Exercise 2.1: [wk05Q2, Solution, Schedule] Assume that $X_1, X_2, \ldots, X_n$ is a random sample from a population with density:
\[
f_X(x|\theta) =
\begin{cases}
\frac{2(\theta - x)}{\theta^2}, & \text{for } 0 < x < \theta; \\
0, & \text{otherwise.}
\end{cases}
\]
Find an estimator for $\theta$ using the method of moments.
Exercise 2.2: [wk05Q5, Solution, Schedule] Consider N independent random variables each having a binomial distribution with parameters $n = 3$ and $\theta$ so that:
\[
\Pr(X_i = k) = \binom{3}{k}\, \theta^k (1 - \theta)^{n-k},
\]
for $i = 1, 2, \ldots, N$ and $k = 0, 1, 2, 3$, and zero otherwise. Assume that of these N random variables $n_0$ take the value 0, $n_1$ take the value 1, $n_2$ take the value 2, and $n_3$ take the value 3 with $N = n_0 + n_1 + n_2 + n_3$.
1. Use maximum likelihood to develop a formula to estimate $\theta$.

2. Assume that when you go to the races that you always bet on 3 races. You have taken a random
sample of your last 20 visits to the races and find that you had no winning bets on 11 visits, one
winning bet on 7 visits, and two winning bets on 2 visits. Estimate the probability of winning
on any single bet.
Exercise 2.3: [wk05Q6, Solution, Schedule] Assume that we have n independent observations $y^\top = [y_1, y_2, \ldots, y_n]$, each with the Pareto p.d.f. given by:
\[
f_{Y_i|\alpha}(y_i|\alpha; A) = \frac{\alpha A^{\alpha}}{y_i^{\alpha+1}},
\]
where $0 < \alpha < \infty$ and $0 < A < y_i < \infty$, and zero otherwise. You are now told the value of A, leaving $\alpha$ as the only unknown parameter.
1. Explain why the likelihood function $L(\alpha; y, A)$ can be written as:
\[
\frac{\alpha^n A^{\alpha n}}{G^{n(\alpha+1)}},
\]
where $G = (y_1 \cdot y_2 \cdot \ldots \cdot y_n)^{1/n}$ is the geometric mean of the observations.
2. Explain why we can express the relationship between the posterior distribution, prior distribution and likelihood function as follows:
\[
\pi(\alpha|y; A) \propto f_{Y|\alpha}(y|\alpha; A)\,\pi(\alpha).
\]
3. We assume our prior pdf for $\alpha$ is such that $\log(\alpha)$ is uniformly distributed, implying:
\[
\pi(\alpha) \propto \frac{1}{\alpha}, \quad 0 < \alpha < \infty.
\]
Show that the posterior pdf for $\alpha$ is:
\[
\pi(\alpha|y; A) \propto \alpha^{n-1} e^{-a n \alpha},
\]
where $a = \log(G/A)$.
4. Explain why the posterior pdf is given by:
\[
\pi(\alpha|y; A) = \frac{(an)^n}{\Gamma(n)}\, \alpha^{n-1} e^{-a n \alpha}, \quad 0 < \alpha < \infty.
\]
5. Calculate the Bayes estimator of $\alpha$, $\hat{\alpha}_B$.
Exercise 2.4: [wk05Q9, Solution, Schedule] Given that there are n realizations of $x_i$, where $i = 1, 2, \ldots, n$. We know that $x_i|p \sim \mathrm{Ber}(p)$ and $p \sim U(0, 1)$.
1. Find the Bayesian estimator for $p$.
2. Find the Bayesian estimator for $p(1 - p)$.
3. Why might we be interested in the Bayesian estimator for $p(1 - p)$? Hint: consider the case when n is large.
Exercise 2.5: [wk05Q12, Solution, Schedule] Suppose that X follows a geometric distribution, with
probability mass function:
\[ \Pr(X = k) = p\,(1 - p)^{k-1}, \quad \text{if } k = 1, 2, \ldots, \text{ and zero otherwise,} \]
and assume a sample of size n.
1. Find the method of moments estimator of p.
2. Find the maximum likelihood estimator of p.
Exercise 2.6: [wk05Q13, Solution, Schedule] The Pareto distribution is often used in economics as a model for a density function with a slowly decaying tail. Its density is given by:
\[
f_X(x|\theta) = \theta\, x_0^{\theta}\, x^{-\theta-1}, \quad x \ge x_0, \ \theta > 1,
\]
and zero otherwise. Assume that $x_0 > 0$ is given and that $x_1, \ldots, x_n$ is a sample from this distribution.
1. Find the method of moments estimate of $\theta$.
2. Find the maximum likelihood estimator of $\theta$.
2.2 Limit Theorems

Exercise 2.7: [wk05Q1, Solution, Schedule] An insurance company has a portfolio of 100 insurance contracts. The company's losses on these contracts are independent and identically distributed. Each loss X has an exponential distribution with mean 5 000 and each policyholder pays a premium of 5 050. Notice that each policyholder pays an amount larger than its expected loss. Determine the probability that the aggregate loss of the insurance company will exceed the total premiums collected. Use the normal approximation (Central Limit Theorem).
Exercise 2.8: [wk05Q3, Solution, Schedule] Let $X_1, X_2, \ldots$ be a sequence of independent random variables with common mean $E[X_k] = \mu$ but different variance $\mathrm{Var}(X_k) = \sigma_k^2$. Suppose:
\[
\frac{1}{n^2} \sum_{k=1}^{n} \sigma_k^2 \to 0, \quad \text{as } n \to \infty.
\]
Prove $\bar{X} \xrightarrow{p} \mu$ in probability.
Exercise 2.9: [wk05Q4, Solution, Schedule] A drunkard executes a random walk in the following
manner: each minute, he takes a step north or south, with probability 1/2 each, and his successive step
directions are independent. Each step he takes is of length 50 cm. Use the central limit theorem to
approximate the probability distribution of his location after one hour. Where is he most likely to be?
Exercise 2.10: [wk05Q7, Solution, Schedule] Using moment generating functions:
1. show that as $n \to \infty$, $p \to 0$ and $np \to \lambda$, the binomial distribution with parameters $n$ and $p$ tends to the Poisson distribution.
2. show that as $\alpha \to \infty$, the gamma distribution with parameters $\alpha$ and $\beta$, properly standardised, tends to the standard Normal distribution.
Exercise 2.11: [wk05Q8, Solution, Schedule] A random variable X with p.d.f.
\[
f_X(x) = \frac{1}{\pi} \cdot \frac{1}{1 + x^2}, \quad \text{for } -\infty < x < \infty,
\]
is said to have a Cauchy distribution. It is well known that the mean of a Cauchy distribution does not exist. Furthermore, suppose $X_1, X_2, \ldots, X_n$ are n independent Cauchy random variables, then it can be shown that the sample mean:
\[
\bar{X}_n = \frac{1}{n} \sum_{k=1}^{n} X_k
\]
also has a Cauchy distribution.^1 Deduce from these results that the Cauchy distribution violates the law of large numbers. Explain why.
Exercise 2.12: [wk05Q10, Solution, Schedule] Let $X_1, X_2, \ldots$ be independent random variables with common density:
\[
f_X(x) = \alpha\, x^{-(\alpha+1)}, \quad \text{for } x > 1,
\]
where $\alpha > 0$. Define a new sequence of random variables:
\[
Y_n = \frac{1}{n^{1/\alpha}}\, X_{(n)},
\]
where $X_{(n)}$ is the highest observation of n i.i.d. r.v. $X_1, \ldots, X_n$.
Show that $Y_n$ converges in distribution as $n \to \infty$ and find the limiting distribution.
1 Proofs of these results are not expected for this course.
Exercise 2.13: [wk05Q11, Solution, Schedule] (Problem from [JR]) Suppose that X1 , X2 , . . . , X20 are
independent random variables with density functions:
\[ f_X(x) = 2x, \quad \text{for } 0 \le x \le 1, \]
and zero otherwise. Let $S = X_1 + \ldots + X_{20}$. Use the central limit theorem to approximate $\Pr(S \le 10)$.
2.3 Evaluating Estimators

Exercise 2.14: [wk06Q1, Solution, Schedule] Let $X_1, X_2, \ldots, X_n$ be a random sample from an exponential distribution with:
\[
f_X(x|\lambda) = \lambda e^{-\lambda x}, \quad x > 0,
\]
and zero otherwise, where $\lambda > 0$. Find the value of $a$ so that the interval from 0 to $a/\bar{X}$ provides a 95% confidence interval for the parameter $\lambda$.
Exercise 2.15: [wk06Q2, Solution, Schedule] Consider a random sampling from a normal distribution with mean $\mu$ and variance $\sigma^2$. Derive a $100\,(1 - \alpha)\%$ confidence interval of $\sigma^2$ when $\mu$ is known.
Exercise 2.16: [wk06Q3, Solution, Schedule] This exercise aims to show that if we sample from a continuous distribution, a pivotal quantity always exists. Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous distribution $f_X(x|\theta)$. Denote the corresponding cumulative distribution function by:
\[
F_X(x|\theta) = \int_{-\infty}^{x} f_X(z|\theta)\,dz.
\]
(a) Show that $F_X(X|\theta) \sim U(0, 1)$.
Hint: Show that $\Pr(F_X(X|\theta) \le x) = x$ using the quantile function (inverse of the c.d.f.), then explain why $x$ (representing a probability, taking values between 0 and 1) would have this distribution.
(b) Show that $W = \log(1/F_X(X|\theta))$ has an exponential distribution with mean 1. To do so, first find the c.d.f. of W.
(c) From (b), deduce that $\sum_{k=1}^{n} \log(1/F_X(X_k|\theta))$ has a Gamma distribution. Specify its parameters.
(d) Use (c) to prove that there will always be a pivotal quantity when sampling from a continuous distribution.
Exercise 2.17: [wk06Q4, Solution, Schedule] (modified based on a past Institute of Actuaries exam.) Let $X_1, X_2, \ldots, X_n$ denote a random sample of a Gamma$(3, \lambda)$ and $\bar{X}$ is the sample mean.
(a) Describe the distribution of the sample mean $\bar{X}$.
(b) Use (a) to construct a lower 95% confidence interval for $\lambda$, of the form $(0, U)$.
(c) Use (a) to construct an upper 95% confidence interval for $\lambda$, of the form $(L, \infty)$.
(d) Use (a) to construct a 95% confidence interval for $\lambda$, of the form $(L, U)$ where L and U are not necessarily equal to those found in (b) and (c).
(e) Evaluate the intervals in (b), (c) and (d) in the case for which the total of a random sample of 20 observations yielded a value of $\sum_{k=1}^{20} x_k = 98.2$.
Exercise 2.18: [wk06Q5, Solution, Schedule] A local health club advertises that its members lose at
least 10 pounds on the average during a 30-day weight loss programme. After receiving a number
of complaints from people who were enticed to join the club, the Better Business Bureau sends out a
representative to the club to check out the claim. The representative sampled the following nine (9)
people who are enrolled in the program:
Person         Before-Weight   After-Weight   Difference
1                   157             150             7
2                   174             167             7
3                   198             187            11
4                   205             198             7
5                   147             146             1
6                   165             153            12
7                   212             199            13
8                   169             171            -2
9                   158             156             2
Sum of x_i        1,585           1,527            58
Sum of x_i^2    283,457         262,465           590
(sums taken over i = 1, ..., 9)
The representative of Better Business Bureau reported its findings in terms of a confidence interval.
Construct the appropriate 95% confidence interval for the average weight loss for participants in the
programme.
Exercise 2.19: [wk06Q6, Solution, Schedule] (Past Institute of Actuaries Exam Question) Independent random samples of size $n_1$ and $n_2$ are taken from the normal populations $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$. Let the sample means be $\bar{X}_1$ and $\bar{X}_2$ and the sample variances be $S_1^2$ and $S_2^2$. You may assume that $\bar{X}_l$ and $S_l^2$, $l = 1, 2$ are independent and distributed as follows:
\[
\bar{X}_k \sim N\left(\mu_k, \frac{\sigma_k^2}{n_k}\right) \quad \text{and} \quad \frac{(n_k - 1)\, S_k^2}{\sigma_k^2} \sim \chi^2(n_k - 1) \quad \text{for } k = 1, 2.
\]
(a) It is required to construct a confidence interval for $(\mu_1 - \mu_2)$, the difference between the population means.
i. Suppose that $\sigma_1^2$ and $\sigma_2^2$ are known. State the distribution of $\bar{X}_1 - \bar{X}_2$ and write down a suitable pivotal quantity together with its sampling distribution. Hence, write down a 95% confidence interval for $(\mu_1 - \mu_2)$.
ii. Suppose that $\sigma_1^2$ and $\sigma_2^2$ are unknown but are known to be equal. State the definition of a $t_k$ variable in terms of independent $N(0, 1)$ and $\chi^2_k$ variables and use it to develop a suitable pivotal quantity. Hence, write down a 95% confidence interval for $(\mu_1 - \mu_2)$.
(b) It is required to construct a confidence interval for $\sigma_1^2/\sigma_2^2$, the ratio of the population variances. State the definition of an $F_{k,l}$ variable in terms of independent $\chi^2_k$ and $\chi^2_l$ variables and use it to develop a suitable pivotal quantity. Hence, obtain a 90% confidence interval for $\sigma_1^2/\sigma_2^2$.
(c) A regional newspaper included a consumer rights article comparing the cost of shopping in corner shops and supermarkets. The researchers investigated the price of a standard selection of household goods in a sample of 10 corner shops selected at random from the region, and in a sample of 10 supermarkets selected at random from the region. The data yielded the following values:
                Sample Mean   Sample S.D.
Corner Shops       22.55         1.22
Supermarkets       19.72         0.96
i. Use the result in part (a)(ii) to calculate a 95% confidence interval for $(\mu_1 - \mu_2)$, the difference between the population means (1 = corner shops, 2 = supermarkets).
ii. Use your result in part (b) to calculate a 90% confidence interval for $\sigma_1^2/\sigma_2^2$, the ratio of the population variances. Use this result to comment briefly on the assumption of equal variances required for the confidence interval in part (c)(i).
Exercise 2.20: [wk06Q7, Solution, Schedule] (IoA, Subject CT3, April 2005, No.6) In a survey
conducted by a mail order company a random sample of 200 customers yielded 172 who indicated
that they were highly satisfied with the delivery time of their orders.
Calculate an approximate 95% confidence interval for the proportion of the company's customers
who are highly satisfied with delivery times.
Exercise 2.21: [wk06Q8, Solution, Schedule] (IoA, Subject CT3, April 2005, No.8) The distribution
of claim size under a certain class of policy is modelled as a normal random variable, and previous
years' records indicate that the standard deviation is 120.
(a) Calculate the width of a 95% confidence interval for the mean claim size if a sample of size 100
is available.
(b) Determine the minimum sample size required to ensure that a 95% confidence interval for the
mean claim size is of width at most 10.
(c) Comment briefly on the comparison of the confidence intervals in (a) and (b) with respect to
widths and sample sizes used.
Exercise 2.22: [wk06Q9, Solution, Schedule] (IoA, Subject CT3, April 2005, No.12 (partial))
1. A random variable Y has a Poisson distribution with parameter $\lambda$ but there is a restriction that zero counts cannot occur. The distribution of Y in this case is referred to as the zero-truncated Poisson distribution.
(a) Show that the probability function of Y is given by:
\[
p_Y(y) = \frac{\lambda^y e^{-\lambda}}{y!\,(1 - e^{-\lambda})}, \quad \text{for } y = 1, 2, 3, \ldots,
\]
and zero otherwise.
(b) Show that $E[Y] = \frac{\lambda}{1 - e^{-\lambda}}$.
2. Answer the following.
(a) Let $y_1, \ldots, y_n$ denote a random sample from the zero-truncated Poisson distribution. Show that the maximum likelihood estimate of $\lambda$ may be determined by the solution to the following equation:
\[
\bar{y} - \frac{\lambda}{1 - e^{-\lambda}} = 0,
\]
and deduce that the maximum likelihood estimate is the same as the method of moments estimate.
(b) Obtain an expression for the Cramer-Rao lower bound for the variance of an unbiased estimator of $\lambda$.
Exercise 2.23: [wk06Q10, Solution, Schedule] (IoA, Subject 101, April 2004, No.12) For the estimation of a Bernoulli probability $p = \Pr(\text{success})$, a series of n independent trials are performed and X represents the number of successes observed.
(a) Write down the likelihood function $L(p; x)$ and show that the maximum likelihood estimator (MLE) of $p$ is $\hat{p} = X/n$.
(b) Answer the following.
1. Determine the Cramer-Rao lower bound for the estimation of $p$.
2. Show that the variance of the MLE is equal to the Cramer-Rao lower bound.
3. Write down an approximate sampling distribution for $\hat{p}$ valid for large n.
(c) In order to develop an approximate 95% confidence interval for $p$ for large n, the following pivotal quantity is to be used:
\[
\frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} \approx N(0, 1).
\]
Assuming that this pivotal quantity is monotonic in $p$, show that rearrangement of the inequality:
\[
-1.96 < \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} < 1.96
\]
leads to a quadratic inequality in $p$, and hence determine an approximate 95% confidence interval for $p$.
(d) A simpler and more widely used approximate confidence interval is obtained by using the following pivotal quantity:
\[
\frac{\hat{p} - p}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}} \approx N(0, 1).
\]
Determine the resulting approximate 95% confidence interval using this.
(e) In two separate applications the following data were observed:
1. 4 successes out of 10 trials
2. 80 successes out of 200 trials
In each case calculate the two approximate confidence intervals from parts (c) and (d) and comment briefly on your answers.
Exercise 2.24: [wk06Q11, Solution, Schedule] A random sample of 16 values, x1 , x2 , . . . , x16 , was
drawn from a normal population and gave the following summary statistics:
\[
\sum_{i=1}^{16} x_i = 51.2, \qquad \sum_{i=1}^{16} x_i^2 = 243.19.
\]
Calculate a 95% confidence interval for the population mean.
Exercise 2.25: [wk06Q12, Solution, Schedule] Consider a random sample of size n from a normal distribution $N(\mu, \sigma^2)$ and let $S^2$ denote the sample variance.
1. State the sampling distribution for $\frac{(n - 1)\, S^2}{\sigma^2}$ and specify an approximate sampling distribution for this expression when n is large.
2. For $n = 101$ calculate an approximate value for the probability that $S^2$ exceeds $\sigma^2$ by more than a factor of 10%, i.e. $\Pr(S^2 > 1.1\,\sigma^2)$.
Exercise 2.26: [wk06Q13, Solution, Schedule] A group of 500 insurance policies gave rise to a total
of 83 claims during the last year. Assuming a Poisson model for the occurrence of claims, calculate
an approximate 95% confidence interval for $\lambda$, the claim rate per policy per year.
Exercise 2.27: [wk06Q14, Solution, Schedule] Let $X_i$, $i = 1, \ldots, n$ denote a random sample of size n from a population with a uniform distribution on the interval $(0, \theta)$. Let $X_{(n)} = \max\{X_1, \ldots, X_n\}$ and define $U = (1/\theta)\,X_{(n)}$.
1. Show that U has distribution function:
\[
F_U(u) =
\begin{cases}
0, & \text{if } u < 0; \\
u^n, & \text{if } 0 \le u \le 1; \\
1, & \text{if } u > 1.
\end{cases}
\]
2. Because the distribution of U does not depend on $\theta$, U is a pivotal quantity. Find the 95% lower confidence bound for $\theta$.
Solutions
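Solution 2.1: [wk05Q2, Exercise, Schedule] To find an estimator for $\theta$ using the method of moments, let $E[X] = \bar{X}$. We then have:
\[
\begin{aligned}
\bar{X} = E[X] &= \int_{-\infty}^{\infty} x\, f_X(x)\,dx \\
&= \int_0^{\theta} x\, \frac{2(\theta - x)}{\theta^2}\,dx \\
&= \frac{2}{\theta^2} \int_0^{\theta} \left(\theta x - x^2\right) dx \\
&= \frac{2}{\theta^2} \left[\frac{\theta x^2}{2} - \frac{x^3}{3}\right]_0^{\theta} \\
&= \frac{2}{\theta^2} \left(\frac{\theta^3}{2} - \frac{\theta^3}{3}\right) \\
&= \frac{\theta}{3}.
\end{aligned}
\]
Hence, the method of moments estimate is:
\[ \hat{\theta} = 3\bar{X}. \]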
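Solution 2.2: [wk05Q5, Exercise, Schedule] Consider N independent random variables each having a binomial distribution with parameters $n = 3$ and $\theta$ so that $\Pr(X_i = k) = \binom{3}{k} \theta^k (1 - \theta)^{n-k}$, for $i = 1, 2, \ldots, N$ and $k = 0, 1, 2, 3$. Assume that of these N random variables $n_0$ take the value 0, $n_1$ take the value 1, $n_2$ take the value 2, and $n_3$ take the value 3 with $N = n_0 + n_1 + n_2 + n_3$.
1. The likelihood function is given by:
\[
\begin{aligned}
L(\theta; x) &= \prod_{i=1}^{N} f_X(x_i) \\
&= \left(\binom{3}{0} (1 - \theta)^3\right)^{n_0} \left(\binom{3}{1} \theta\,(1 - \theta)^2\right)^{n_1} \left(\binom{3}{2} \theta^2 (1 - \theta)\right)^{n_2} \left(\binom{3}{3} \theta^3\right)^{n_3}.
\end{aligned}
\]
The log-likelihood function is given by:
\[
\begin{aligned}
\ell(\theta; x) = \log L(\theta; x) &= \sum_{i=1}^{N} \log(f_X(x_i)) \\
&\overset{*}{=} n_0 \left(\log\binom{3}{0} + 3 \log(1 - \theta)\right) + n_1 \left(\log\binom{3}{1} + \log(\theta) + 2 \log(1 - \theta)\right) \\
&\quad + n_2 \left(\log\binom{3}{2} + 2 \log(\theta) + \log(1 - \theta)\right) + n_3 \left(\log\binom{3}{3} + 3 \log(\theta)\right),
\end{aligned}
\]
* using $\log(a \cdot b) = \log(a) + \log(b)$ and $\log(a^c \cdot b) = c \log(a) + \log(b)$.
Then, take the FOC of $\ell(\theta; x)$:
\[
\begin{aligned}
\frac{\partial \ell(\theta; x)}{\partial \theta} &= -\frac{3 n_0}{1 - \theta} + \frac{n_1}{\theta} - \frac{2 n_1}{1 - \theta} + \frac{2 n_2}{\theta} - \frac{n_2}{1 - \theta} + \frac{3 n_3}{\theta} \\
&= \frac{n_1 + 2 n_2 + 3 n_3}{\theta} - \frac{3 n_0 + 2 n_1 + n_2}{1 - \theta}.
\end{aligned}
\]
Equating this to zero we obtain:
\[
\frac{n_1 + 2 n_2 + 3 n_3}{\theta} - \frac{3 n_0 + 2 n_1 + n_2}{1 - \theta} = 0,
\]
or, equivalently:
\[
(n_1 + 2 n_2 + 3 n_3)\,(1 - \theta) = (3 n_0 + 2 n_1 + n_2)\,\theta.
\]
Thus the maximum likelihood estimator for $\theta$ is:
\[
\begin{aligned}
\hat{\theta} &\overset{*}{=} \frac{n_1 + 2 n_2 + 3 n_3}{(n_1 + 2 n_2 + 3 n_3) + (3 n_0 + 2 n_1 + n_2)} \\
&= \frac{n_1 + 2 n_2 + 3 n_3}{3 n_0 + 3 n_1 + 3 n_2 + 3 n_3} \\
&= \frac{n_1 + 2 n_2 + 3 n_3}{3N},
\end{aligned}
\]
* using: $\frac{a}{1 - a} = \frac{b}{c} \Rightarrow \frac{1}{1/a - 1} = \frac{b}{c} \Rightarrow \frac{1}{a} - 1 = \frac{c}{b} \Rightarrow \frac{1}{a} = \frac{c + b}{b} \Rightarrow a = \frac{b}{b + c}$.
2. We have:
\[ N = 20, \quad n_0 = 11, \quad n_1 = 7, \quad n_2 = 2, \quad n_3 = 0. \]
Thus the ML estimate for $\theta$ is given by:
\[
\hat{\theta} = \frac{n_1 + 2 n_2 + 3 n_3}{3N} = \frac{7 + 4}{60} = \frac{11}{60} = 0.1833.
\]
Thus, the estimated probability of winning any single bet is 0.1833.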
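The closed-form estimate can be cross-checked against a direct grid search over the log-likelihood. The sketch below does this for the races data in part 2; the function names and the grid resolution are illustrative choices, not part of the original solution.

```python
import math

def mle_theta(counts):
    """Closed-form MLE; counts[k] = number of variables equal to k (k = 0..3)."""
    total_successes = sum(k * c for k, c in enumerate(counts))
    return total_successes / (3 * sum(counts))

def log_likelihood(theta, counts):
    """Binomial(3, theta) log-likelihood for grouped counts."""
    return sum(
        c * (math.log(math.comb(3, k))
             + k * math.log(theta)
             + (3 - k) * math.log(1 - theta))
        for k, c in enumerate(counts)
    )

counts = [11, 7, 2, 0]  # n0, n1, n2, n3 from part 2
theta_hat = mle_theta(counts)

# Brute-force maximisation on a grid of candidate values in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
theta_grid = max(grid, key=lambda th: log_likelihood(th, counts))
print(theta_hat, theta_grid)
```

The grid maximiser should sit within one grid step of the closed-form value 11/60.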

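Solution 2.3: [wk05Q6, Exercise, Schedule]
1. The likelihood function is given by:
\[
\begin{aligned}
L(\alpha; y, A) &= \prod_{i=1}^{n} f_Y(y_i) = \prod_{i=1}^{n} \frac{\alpha A^{\alpha}}{y_i^{\alpha+1}} \\
&= \frac{\prod_{i=1}^{n} \alpha A^{\alpha}}{\prod_{i=1}^{n} y_i^{\alpha+1}} \\
&= \frac{\alpha^n A^{\alpha n}}{\prod_{i=1}^{n} y_i^{\alpha+1}} \\
&= \frac{\alpha^n A^{\alpha n}}{\left(\left(\prod_{i=1}^{n} y_i\right)^{1/n}\right)^{n(\alpha+1)}} \\
&= \frac{\alpha^n A^{\alpha n}}{G^{n(\alpha+1)}}.
\end{aligned}
\]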
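2. In the lecture we have seen that:
\[
\begin{aligned}
\pi(\alpha|y; A) = f_{\alpha|Y}(\alpha|y; A) &\overset{*}{=} \frac{f_{Y|\alpha}(y|\alpha; A)\,\pi(\alpha)}{\int f_{Y|\alpha}(y|\alpha; A)\,\pi(\alpha)\,d\alpha} \\
&\overset{**}{=} \frac{f_{Y|\alpha}(y|\alpha; A)\,\pi(\alpha)}{f_Y(y; A)} \\
&\overset{***}{\propto} f_{Y|\alpha}(y|\alpha; A)\,\pi(\alpha),
\end{aligned}
\]
* using Bayes' formula: $\Pr(A_i|B) = \frac{\Pr(B|A_i)\Pr(A_i)}{\sum_j \Pr(B|A_j)\Pr(A_j)}$, where the sets $A_i$ (corresponding to $\pi(\alpha)$), $i = 1, \ldots, n$, form a complete partition of the sample space.
** using the law of total probability: $\Pr(A) = \sum_i \Pr(A|B_i)\Pr(B_i)$ if the sets $B_i$ (corresponding to $\pi(\alpha)$), $i = 1, \ldots, n$, form a complete partition of the sample space.
*** using that $f_Y(y; A)$ is, given the data, a known constant.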
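3. We have that the posterior density is given by:
\[
\begin{aligned}
\pi(\alpha|y; A) = f_{\alpha|Y}(\alpha|y; A) &\propto f_{Y|\alpha}(y|\alpha; A)\,\pi(\alpha) \\
&\overset{*}{=} \pi(\alpha) \prod_{i=1}^{n} f_Y(y_i; A) \\
&= L(\alpha; y, A)\,\pi(\alpha) \\
&\propto \frac{1}{\alpha}\, L(\alpha; y, A) \\
&= \alpha^{n-1}\, \frac{A^{\alpha n}}{G^{n(\alpha+1)}} \\
&= \alpha^{n-1} \left(\frac{A}{G}\right)^{\alpha n} \frac{1}{G^n} \\
&= \alpha^{n-1} \left(\frac{G}{A}\right)^{-\alpha n} \frac{1}{G^n} \\
&\overset{**}{\propto} \alpha^{n-1} \left(\frac{G}{A}\right)^{-\alpha n} \\
&= \alpha^{n-1} \exp\left(\log\left(\left(\frac{G}{A}\right)^{-\alpha n}\right)\right) \\
&= \alpha^{n-1} \exp\left(-\alpha n \log\left(\frac{G}{A}\right)\right) \\
&= \alpha^{n-1} \exp(-\alpha n a),
\end{aligned}
\]
* using independence between all $f_{Y|\alpha}(y_i|\alpha; A)$ and $f_{Y|\alpha}(y_j|\alpha; A)$ for $i \neq j$,
** using that $1/G^n$ is a known constant.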
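4. We have that $\pi(\alpha|y; A) \propto \alpha^{n-1} \exp(-\alpha n a)$ or, equivalently, there exists some constant $c < \infty$ for which $\pi(\alpha|y; A) = c\, \alpha^{n-1} \exp(-\alpha n a)$. We need to determine the constant $c$. We know that $\int_0^\infty \pi(\alpha|y; A)\,d\alpha = 1$, because otherwise it is not a posterior density.
Given this observation, we are going to compare $c\, \alpha^{n-1} \exp(-\alpha n a)$ with the p.d.f. of $X \sim \text{Gamma}(\alpha_x, \beta_x)$, which is given by:
\[
f_X(x) = \frac{\beta_x^{\alpha_x}}{\Gamma(\alpha_x)}\, x^{\alpha_x - 1} e^{-\beta_x x}.
\]
Now, substitute $x = \alpha$, $\alpha_x = n$, $\beta_x = an$, and $c = \frac{\beta_x^{\alpha_x}}{\Gamma(\alpha_x)} = \frac{(an)^n}{\Gamma(n)}$. Then we have the density of a Gamma$(n, an)$ distribution. Hence, the posterior density is given by:
\[
\pi(\alpha|y; A) = \frac{(an)^n}{\Gamma(n)}\, \alpha^{n-1} e^{-a n \alpha}, \quad \text{for } 0 < \alpha < \infty,
\]
and zero otherwise.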
5. The Bayesian estimator of $\alpha$ is the expected value of the posterior. The posterior has a $Z \sim \text{Gamma}(n, an)$ distribution. We have that $E[Z] = \frac{n}{na}$. Thus:
\[
\hat{\alpha}_B = E[\alpha|y; A] = \frac{n}{na} = \frac{1}{a}.
\]
Thus the Bayesian estimator of $\alpha$ is $\frac{1}{a}$.
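As a small illustration of the result $\hat{\alpha}_B = 1/a$ with $a = \log(G/A)$, the sketch below computes the estimate from a sample. The data values and the function name are hypothetical; the only formula used is $a = \frac{1}{n}\sum_i \log(y_i/A)$, which is $\log(G/A)$ written out.

```python
import math

def bayes_alpha(ys, A):
    """Posterior-mean estimate 1/a, with a = log(G/A) and G the geometric mean."""
    if any(y <= A for y in ys):
        raise ValueError("all observations must exceed A")
    # log(G/A) = (1/n) * sum(log(y_i)) - log(A) = (1/n) * sum(log(y_i / A))
    a = sum(math.log(y / A) for y in ys) / len(ys)
    return 1.0 / a

sample = [1.8, 2.5, 1.2, 4.0, 1.5]  # hypothetical observations
print(bayes_alpha(sample, A=1.0))
```

Working with logs of the ratios avoids forming the product $y_1 \cdots y_n$ directly, which could overflow for long samples.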
Solution 2.4: [wk05Q9, Exercise, Schedule] Given that there are n realizations of $x_i$, where $i = 1, 2, \ldots, n$. We know that $x_i|p \sim \mathrm{Ber}(p)$ and $p \sim U(0, 1)$. We are asked to find the Bayesian estimators for $p$ and $p(1 - p)$. Since the n random variables are independent given $p$:
\[
f(x_1, x_2, \ldots, x_n|p) = \prod_{i=1}^{n} f(x_i|p) = p^{\sum_{i=1}^{n} x_i} (1 - p)^{n - \sum_{i=1}^{n} x_i}.
\]
Since the prior density of $p$ is uniform on $(0, 1)$, i.e. equal to 1 there, the joint density is:
\[
f(x_1, x_2, \ldots, x_n, p) = p^{\sum_{i=1}^{n} x_i} (1 - p)^{n - \sum_{i=1}^{n} x_i}.
\]
Then we can compute the marginal density of $x_i$, where $i = 1, 2, \ldots, n$:
\[
f(x_1, x_2, \ldots, x_n) = \int_0^1 p^{\sum_{i=1}^{n} x_i} (1 - p)^{n - \sum_{i=1}^{n} x_i}\,dp = \frac{\Gamma\left(\sum_{i=1}^{n} x_i + 1\right) \Gamma\left(n - \sum_{i=1}^{n} x_i + 1\right)}{\Gamma(n + 2)}.
\]
1. Method 1: Hence we can obtain the posterior density:
\[
\begin{aligned}
f(p|x_1, x_2, \ldots, x_n) &= \frac{f(x_1, x_2, \ldots, x_n, p)}{f(x_1, x_2, \ldots, x_n)} \\
&= \frac{\Gamma(n + 2)}{\Gamma\left(\sum_{i=1}^{n} x_i + 1\right) \Gamma\left(n + 1 - \sum_{i=1}^{n} x_i\right)}\, p^{\sum_{i=1}^{n} x_i} (1 - p)^{n - \sum_{i=1}^{n} x_i},
\end{aligned}
\]
which is the probability density function of a Beta$\left(\sum_{i=1}^{n} x_i + 1,\ n + 1 - \sum_{i=1}^{n} x_i\right)$ distribution.
Method 2: Observe that $f(x_1, x_2, \ldots, x_n, p)$, viewed as a function of $p$, is proportional to the p.d.f. of a Beta distribution, and use this to find the distribution of $f(p|x_1, x_2, \ldots, x_n)$. Hence, we have $f(p|x_1, x_2, \ldots, x_n) \propto f_Y(p)$, where $Y \sim \text{Beta}\left(\sum_{i=1}^{n} x_i + 1,\ n + 1 - \sum_{i=1}^{n} x_i\right)$.
The Bayesian estimator for $p$ will thus be:
\[
\hat{p}_B = E[p|X] = \frac{\sum_{i=1}^{n} x_i + 1}{n + 2}.
\]
(See Formulae and Tables page 13).
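2. Now we wish to find a Bayesian estimator for $p(1 - p)$. Using a similar idea:
\[
\begin{aligned}
\widehat{(p(1 - p))}_B &= E[p(1 - p)|X] \\
&= \int_0^1 p(1 - p)\, f(p|x_1, x_2, \ldots, x_n)\,dp \\
&= \frac{\Gamma(n + 2)}{\Gamma\left(\sum_{i=1}^{n} x_i + 1\right) \Gamma\left(n + 1 - \sum_{i=1}^{n} x_i\right)} \int_0^1 p^{1 + \sum_{i=1}^{n} x_i} (1 - p)^{n + 1 - \sum_{i=1}^{n} x_i}\,dp \\
&\overset{*}{=} \frac{\Gamma(n + 2)}{\Gamma\left(\sum_{i=1}^{n} x_i + 1\right) \Gamma\left(n + 1 - \sum_{i=1}^{n} x_i\right)} \cdot \frac{\Gamma\left(\sum_{i=1}^{n} x_i + 2\right) \Gamma\left(n - \sum_{i=1}^{n} x_i + 2\right)}{\Gamma(n + 4)} \\
&\overset{**}{=} \frac{\Gamma(n + 2)}{\Gamma\left(\sum_{i=1}^{n} x_i + 1\right) \Gamma\left(n + 1 - \sum_{i=1}^{n} x_i\right)} \cdot \frac{\left(\sum_{i=1}^{n} x_i + 1\right) \Gamma\left(\sum_{i=1}^{n} x_i + 1\right) \cdot \left(n - \sum_{i=1}^{n} x_i + 1\right) \Gamma\left(n - \sum_{i=1}^{n} x_i + 1\right)}{(n + 3)\,(n + 2)\,\Gamma(n + 2)} \\
&= \frac{\left(\sum_{i=1}^{n} x_i + 1\right)\left(n + 1 - \sum_{i=1}^{n} x_i\right)}{(n + 3)(n + 2)}.
\end{aligned}
\]
* using the Beta function: $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)} = \int_0^1 x^{\alpha - 1} (1 - x)^{\beta - 1}\,dx$, where $\alpha = \sum_{i=1}^{n} x_i + 2$, $\beta = n - \sum_{i=1}^{n} x_i + 2$, $\alpha + \beta = n + 4$.
** using the Gamma function: $\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$.
Alternatively, using the first two moments of the beta distribution (see Formulae and Tables page 13) we have:
\[
\begin{aligned}
\widehat{(p(1 - p))}_B &= E[p(1 - p)|X] \\
&= E[p|X] - E\left[p^2|X\right] \\
&\overset{*}{=} \frac{\sum_{i=1}^{n} x_i + 1}{n + 2} - \frac{\Gamma(a + b)\,\Gamma(a + 2)}{\Gamma(a)\,\Gamma(a + b + 2)} \\
&= \frac{\sum_{i=1}^{n} x_i + 1}{n + 2} - \frac{(a + 1)\,a}{(a + b + 1)(a + b)} \\
&= \frac{\left(\sum_{i=1}^{n} x_i + 1\right)\left(n + 1 - \sum_{i=1}^{n} x_i\right)}{(n + 3)(n + 2)},
\end{aligned}
\]
* where $a = \sum_{i=1}^{n} x_i + 1$ and $b = n + 1 - \sum_{i=1}^{n} x_i$.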
3. We are interested in the Bayesian estimator of $p(1 - p)$, since $n p(1 - p)$ is the variance of the binomial distribution (with n a known constant) and we can use this for the normal approximation.
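Both estimators can be verified exactly with rational arithmetic, because for integer arguments the Beta function reduces to factorials. The sketch below is illustrative only: the data ($s = 3$ successes in $n = 8$ trials) and the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def beta_fn(a, b):
    """Beta function B(a, b) for positive integers a and b."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

def bayes_estimates(successes, n):
    """Posterior means of p and p(1 - p) under a U(0, 1) prior."""
    a, b = successes + 1, n - successes + 1  # Beta(a, b) posterior
    p_hat = Fraction(a, a + b)
    pq_hat = Fraction(a * b, (n + 3) * (n + 2))
    return p_hat, pq_hat

s, n = 3, 8  # hypothetical: 3 successes in 8 trials
p_hat, pq_hat = bayes_estimates(s, n)

# Cross-check against the defining Beta integrals:
# E[p | x] = B(a+1, b) / B(a, b), E[p(1-p) | x] = B(a+1, b+1) / B(a, b).
a, b = s + 1, n - s + 1
check_p = beta_fn(a + 1, b) / beta_fn(a, b)
check_pq = beta_fn(a + 1, b + 1) / beta_fn(a, b)
print(p_hat, check_p, pq_hat, check_pq)
```

Using `Fraction` keeps the comparison exact, so agreement here is an identity rather than a floating-point coincidence.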
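Solution 2.5: [wk05Q12, Exercise, Schedule] Note that X can be interpreted as a geometric random variable where k is the total number of trials. Here $E[X] = \frac{1}{p}$.
1. The method of moments estimator is given by:
\[
\bar{X} = \frac{1}{p} \quad \Rightarrow \quad \tilde{p} = \frac{1}{\bar{X}} = \frac{n}{\sum_{i=1}^{n} X_i}.
\]
2. The likelihood function is:
\[
L(p; x) = \prod_{i=1}^{n} f_X(x_i) = \prod_{i=1}^{n} p\,(1 - p)^{x_i - 1} = p^n (1 - p)^{\sum_{i=1}^{n} x_i - n}.
\]
The log-likelihood function is:
\[
\ell(p; x) = \log L(p; x) = \sum_{i=1}^{n} \log(f_X(x_i)) = n \log(p) + \left(\sum_{i=1}^{n} x_i - n\right) \log(1 - p).
\]
Take the FOC of $\ell(p; x)$ with respect to $p$ and equate it to zero:
\[
\ell'(p) = \frac{n}{p} - \frac{\sum_{i=1}^{n} x_i - n}{1 - p} = 0.
\]
Then we obtain the maximum likelihood estimator for $p$:
\[
\hat{p} \overset{*}{=} \frac{n}{\sum_{i=1}^{n} X_i - n + n} = \frac{n}{\sum_{i=1}^{n} X_i},
\]
* using: $\frac{a}{1 - a} = \frac{b}{c} \Rightarrow \frac{1}{1/a - 1} = \frac{b}{c} \Rightarrow \frac{1}{a} - 1 = \frac{c}{b} \Rightarrow \frac{1}{a} = \frac{c + b}{b} \Rightarrow a = \frac{b}{b + c}$.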
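Solution 2.6: [wk05Q13, Exercise, Schedule] For the Pareto distribution with parameters $x_0$ and $\theta$ we have the following p.d.f.:
\[
f(x) = \theta\,(x_0)^{\theta}\, x^{-\theta-1}, \quad x \ge x_0, \ \theta > 1,
\]
and zero otherwise. The expected value of the random variable X is then given by:
\[
\begin{aligned}
E[X] &= \int_{-\infty}^{\infty} x\, f_X(x)\,dx = \int_{x_0}^{\infty} x\, \theta\,(x_0)^{\theta}\, x^{-\theta-1}\,dx \\
&= \theta\,(x_0)^{\theta} \int_{x_0}^{\infty} x^{-\theta}\,dx \\
&= \theta\,(x_0)^{\theta} \left[-\frac{x^{-\theta+1}}{\theta - 1}\right]_{x_0}^{\infty} \\
&= \frac{\theta}{\theta - 1}\,(x_0)^{\theta}\, x_0^{-\theta+1} \\
&= \frac{\theta}{\theta - 1}\, x_0.
\end{aligned}
\]
1. Given $x_0$, we have $E[X] = \frac{\theta}{\theta - 1}\, x_0$, thus:
\[
\begin{aligned}
\frac{\theta}{\theta - 1}\, x_0 &= \bar{X} \\
\theta\, x_0 &= \bar{X}\,(\theta - 1) \\
\theta\, x_0 &= \theta \bar{X} - \bar{X} \\
\bar{X} &= \theta\left(\bar{X} - x_0\right) \\
\hat{\theta} &= \frac{\bar{X}}{\bar{X} - x_0}.
\end{aligned}
\]
Thus the method of moments estimator of $\theta$ is $\frac{\bar{X}}{\bar{X} - x_0}$.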

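2. The likelihood function is given by:
\[
L(\theta; x) = \prod_{i=1}^{n} f_X(x_i) = \prod_{i=1}^{n} \theta\,(x_0)^{\theta}\, x_i^{-\theta-1} = \theta^n (x_0)^{n\theta} \prod_{i=1}^{n} x_i^{-\theta-1}.
\]
The log-likelihood function is given by:
\[
\begin{aligned}
\ell(\theta; x) = \log(L(\theta; x)) &= \sum_{i=1}^{n} \log(f_X(x_i)) \\
&= n \log(\theta) + n\theta \log(x_0) - (\theta + 1) \sum_{i=1}^{n} \log(x_i).
\end{aligned}
\]
Take the FOC of $\ell(\theta; x)$ and equate it to zero:
\[
\begin{aligned}
\frac{\partial \ell(\theta)}{\partial \theta} = \frac{n}{\theta} + n \log(x_0) - \sum_{i=1}^{n} \log(x_i) &= 0 \\
\frac{n}{\theta} &= -n \log(x_0) + \sum_{i=1}^{n} \log(x_i) \\
\hat{\theta} &= \frac{n}{-n \log(x_0) + \sum_{i=1}^{n} \log(x_i)}.
\end{aligned}
\]
Thus, the maximum likelihood estimator for $\theta$ is given by $\frac{n}{-n \log(x_0) + \sum_{i=1}^{n} \log(x_i)}$.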
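Both Pareto estimators can be tried on simulated data. The sketch below is an illustration under stated assumptions: it draws from the density by inverse transform (using $F_X(x) = 1 - (x_0/x)^{\theta}$ for $x \ge x_0$), and the true parameter $\theta = 4$, the sample size, the seed and the function names are arbitrary choices made here.

```python
import math
import random

def pareto_sample(theta, x0, n, rng):
    """Inverse-transform draws: F(x) = 1 - (x0/x)**theta, so X = x0 * U**(-1/theta)."""
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [x0 * (1.0 - rng.random()) ** (-1.0 / theta) for _ in range(n)]

def mom_theta(xs, x0):
    """Method of moments estimate: mean / (mean - x0)."""
    mean = sum(xs) / len(xs)
    return mean / (mean - x0)

def mle_theta(xs, x0):
    """Maximum likelihood estimate: n / (sum(log x_i) - n log x0)."""
    n = len(xs)
    return n / (sum(math.log(x) for x in xs) - n * math.log(x0))

rng = random.Random(2131)
xs = pareto_sample(theta=4.0, x0=1.0, n=20_000, rng=rng)
print(mom_theta(xs, 1.0), mle_theta(xs, 1.0))
```

With a sample of this size both estimates should land near the true value, with the maximum likelihood estimate typically the less variable of the two because the Pareto tail is heavy.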
Solution 2.7: [wk05Q1, Exercise, Schedule] We are given that $X \sim \text{Exp}(1/5000)$. Thus, $E[X] = 5000$ and $\mathrm{Var}(X) = (5000)^2$. Let $S = X_1 + \ldots + X_{100}$. Then $E[S] = 100\,(5000) = 500{,}000$ and $\mathrm{Var}(S) = 100\,(5000)^2$. Thus, using the central limit theorem, we have:
\[
\begin{aligned}
\Pr(S > 100\,(5050)) &= \Pr\left(\frac{S - E(S)}{\sqrt{\mathrm{Var}(S)}} > \frac{100\,(50)}{10\,(5000)}\right) \\
&\approx \Pr(Z > 0.10) = 1 - 0.5398 = 0.4602.
\end{aligned}
\]
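The arithmetic of this normal approximation can be reproduced with the standard library, using the error function for the normal c.d.f. instead of tables. The variable and function names below are illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal c.d.f. expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, mean_loss, premium = 100, 5000.0, 5050.0
sd_total = math.sqrt(n) * mean_loss          # exponential: sd equals the mean
z = n * (premium - mean_loss) / sd_total     # standardised premium income
prob = 1.0 - normal_cdf(z)
print(round(z, 2), round(prob, 4))
```

Changing `n` shows how the probability of a shortfall falls as the portfolio grows, since the standardised margin `z` scales with the square root of `n`.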
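Solution 2.8: [wk05Q3, Exercise, Schedule] To prove $\bar{X}_n \xrightarrow{p} \mu$ in probability, we show that if we take any $\varepsilon > 0$, we must have:
\[
\Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) \to 0, \quad \text{as } n \to \infty,
\]
or, equivalently:
\[
\lim_{n\to\infty} \Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) = 0.
\]
First, note that we have:
\[
E\left[\bar{X}_n\right] = \mu \quad \text{and} \quad \mathrm{Var}\left(\bar{X}_n\right) = \frac{1}{n^2} \sum_{k=1}^{n} \sigma_k^2.
\]
Applying Chebyshev's inequality:
\[
\Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2} \cdot \frac{1}{n^2} \sum_{k=1}^{n} \sigma_k^2.
\]
And take the limits on both sides:
\[
\lim_{n\to\infty} \Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) \le \lim_{n\to\infty} \frac{1}{\varepsilon^2} \cdot \frac{1}{n^2} \sum_{k=1}^{n} \sigma_k^2 = \frac{1}{\varepsilon^2} \underbrace{\lim_{n\to\infty} \frac{1}{n^2} \sum_{k=1}^{n} \sigma_k^2}_{=0} = 0.
\]
Thus, the result follows.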
Solution 2.9: [wk05Q4, Exercise, Schedule] Let L be the location after one hour (or 60 minutes). Therefore:
\[ L = X_1 + \ldots + X_{60}, \]
where
\[
X_k =
\begin{cases}
50 \text{ cm}, & \text{w.p. } \frac{1}{2} \\
-50 \text{ cm}, & \text{w.p. } \frac{1}{2},
\end{cases}
\]
so that $E[X_k] = 0$ and $\mathrm{Var}(X_k) = 2500$.
Therefore,
\[ E[L] = 0 \quad \text{and} \quad \mathrm{Var}(L) = 60\,(2500) = 150000. \]
Thus, using the central limit theorem, we have:
\[
\Pr(L \le x) = \Pr\left(\frac{L - E[L]}{\sqrt{\mathrm{Var}(L)}} \le \frac{x}{\sqrt{150000}}\right) \approx \Pr\left(Z \le \frac{x}{100\sqrt{15}}\right).
\]
In other words,
\[ L \sim N(0, 150000) \]
approximately. The mean of a normal distribution is also its mode, therefore his most likely position after one hour is 0, the point where he started.
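Solution 2.10: [wk05Q7, Exercise, Schedule] We use moment generating functions to show that:
1. The binomial tends to the Poisson: Let $X \sim \text{Binomial}(n, p)$. Its m.g.f. is therefore:
\[
\begin{aligned}
M_X(t) &= \left(1 - p + p e^t\right)^n \\
&\quad \text{(let } np = \lambda \text{ so that } p = \lambda/n\text{)} \\
&= \left(1 - \frac{\lambda}{n} + \frac{\lambda e^t}{n}\right)^n \\
&= \left(1 + \frac{\lambda\left(e^t - 1\right)}{n}\right)^n,
\end{aligned}
\]
and by taking the limit on both sides, we have:
\[
\lim_{n\to\infty} M_X(t) = \lim_{n\to\infty} \left(1 + \frac{\lambda\left(e^t - 1\right)}{n}\right)^n = \exp\left(\lambda\left(e^t - 1\right)\right),
\]
which is the moment generating function of a Poisson with mean $\lambda$.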
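2. The gamma, properly standardized, tends to Normal: Let $X \sim \text{Gamma}(\alpha, \beta)$ so that its density is of the form:
\[
f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x}, \quad \text{for } x \ge 0,
\]
and zero otherwise, and its m.g.f. is:
\[
M_X(t) = \left(\frac{\beta}{\beta - t}\right)^{\alpha}.
\]
Its mean and variance are, respectively, $\alpha/\beta$ and $\alpha/\beta^2$. These results have been derived in lecture week 2. Consider the standardized Gamma random variable:
\[
Y = \frac{X - E(X)}{\sqrt{\mathrm{Var}(X)}} = \frac{X - \alpha/\beta}{\sqrt{\alpha/\beta^2}} = \frac{\beta X - \alpha}{\sqrt{\alpha}} = \frac{\beta X}{\sqrt{\alpha}} - \sqrt{\alpha}.
\]
Its moment generating function is:
\[
\begin{aligned}
M_Y(t) &= e^{-\sqrt{\alpha}\,t}\, E\left[e^{\frac{\beta t}{\sqrt{\alpha}} X}\right] = e^{-\sqrt{\alpha}\,t}\, M_X\left(\frac{\beta t}{\sqrt{\alpha}}\right) \\
&= e^{-\sqrt{\alpha}\,t} \left(\frac{\beta}{\beta - \beta t/\sqrt{\alpha}}\right)^{\alpha} = e^{-\sqrt{\alpha}\,t}\, e^{-\alpha \log\left(1 - t/\sqrt{\alpha}\right)} \\
&= \exp\left(-\sqrt{\alpha}\,t - \alpha\left(-t/\sqrt{\alpha} - \frac{1}{2}\left(t/\sqrt{\alpha}\right)^2 + R\right)\right) \\
&\quad \text{(here } R \text{ is the Taylor series remainder term)} \\
&= \exp\left(\frac{1}{2} t^2 + R'\right),
\end{aligned}
\]
where $R'$ involves powers of $1/\sqrt{\alpha}$. Thus in the limit, $M_Y(t) \to \exp\left(\frac{1}{2} t^2\right)$ as $\alpha \to \infty$.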
0
Solution 2.11: [wk05Q8, Exercise, Schedule] If the law of large numbers were to hold here, the sample mean $\bar{X}$ would approach the mean of X, which does not exist in this case. Moreover, the sample mean has the same Cauchy distribution for every n, so it does not settle down to any constant as n grows. At first glance it may therefore seem that nothing is violated, since there is no mean to converge to. But the law of large numbers assumes a finite mean, and this assumption does not hold for the Cauchy distribution, so the law of large numbers cannot hold.
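Solution 2.12: [wk05Q10, Exercise, Schedule] The common distribution function is given by:
\[
F_X(x) = \int_1^x \alpha\, u^{-(\alpha+1)}\,du = \left[-u^{-\alpha}\right]_1^x = 1 - x^{-\alpha}, \quad \text{if } x > 1,
\]
and zero otherwise. The distribution function of $Y_n$ will be:
\[
\begin{aligned}
F_{Y_n}(x) = \Pr(Y_n \le x) &= \Pr\left(\frac{1}{n^{1/\alpha}}\, X_{(n)} \le x\right) \\
&= \Pr\left(X_{(n)} \le n^{1/\alpha} x\right) = \left(1 - \left(n^{1/\alpha} x\right)^{-\alpha}\right)^n \\
&= \left(1 - \frac{x^{-\alpha}}{n}\right)^n,
\end{aligned}
\]
if $x > n^{-1/\alpha}$ and zero otherwise. Notice that whereas $X > 1$, the transformed variable $Y_n = \frac{X_{(n)}}{n^{1/\alpha}}$ can take any value $y > n^{-1/\alpha}$, and this lower bound tends to 0 as $n \to \infty$. Taking the limit as $n \to \infty$, we have:
\[
\lim_{n\to\infty} F_{Y_n}(x) = \lim_{n\to\infty} \left(1 - \frac{x^{-\alpha}}{n}\right)^n = \exp\left(-x^{-\alpha}\right).
\]
Thus, the limit exists and therefore $Y_n$ converges in distribution. The limiting distribution is:
\[
F_Y(y) = \exp\left(-y^{-\alpha}\right), \quad \text{for } y > 0,
\]
and zero otherwise. The corresponding density is:
\[
f_Y(y) = \frac{\partial F_Y(y)}{\partial y} = \alpha\, y^{-\alpha-1} \exp\left(-y^{-\alpha}\right), \quad \text{if } y > 0,
\]
and zero otherwise. You can prove that this is a legitimate density by $f_Y(y) \ge 0$ for all $y$, because $\alpha > 0$, $y^{\alpha+1} > 0$ and $\exp(-y^{-\alpha}) \ge 0$, and $F_Y(\infty) = \int_0^\infty f_Y(y)\,dy = \exp(-0) = 1$.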
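The convergence of the exact distribution function of $Y_n$ to the limit $\exp(-y^{-\alpha})$ can be inspected numerically. The sketch below compares the two at a few points for increasing $n$; the value $\alpha = 2$, the evaluation points and the function names are illustrative assumptions.

```python
import math

def cdf_scaled_max(x, alpha, n):
    """Exact c.d.f. of Y_n = X_(n) / n**(1/alpha) for f(x) = alpha * x**-(alpha+1), x > 1."""
    if x <= n ** (-1.0 / alpha):
        return 0.0  # then n**(1/alpha) * x <= 1, below the support of X
    return (1.0 - x ** (-alpha) / n) ** n

def cdf_limit(x, alpha):
    """Limiting c.d.f. exp(-x**-alpha) for x > 0."""
    return math.exp(-x ** (-alpha)) if x > 0 else 0.0

alpha = 2.0
points = (0.5, 1.0, 2.0, 5.0)
for n in (10, 1_000, 100_000):
    gap = max(abs(cdf_scaled_max(x, alpha, n) - cdf_limit(x, alpha)) for x in points)
    print(n, gap)
```

The largest gap over the chosen points shrinks roughly like $1/n$.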
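Solution 2.13: [wk05Q11, Exercise, Schedule] The mean and the variance of S are respectively:
\[
E[S] = \frac{40}{3} \quad \text{and} \quad \mathrm{Var}(S) = \frac{10}{9}.
\]
Thus, using the central limit theorem, we have:
\[
\begin{aligned}
\Pr(S \le 10) &= \Pr\left(\frac{S - E[S]}{\sqrt{\mathrm{Var}(S)}} \le \frac{10 - (40/3)}{\sqrt{10/9}}\right) \\
&\approx \Pr\left(Z \le -\sqrt{10}\right) = \Pr(Z \le -3.16) = 0.0008.
\end{aligned}
\]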
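Solution 2.14: [wk06Q1, Exercise, Schedule] We wish to find $a$ so that:
\[
\Pr\left(0 < \lambda < \frac{a}{\bar{X}}\right) = 0.95,
\]
or, equivalently,
\[
\Pr\left(\lambda \bar{X} < a\right) = 0.95,
\]
since $\lambda > 0$, $\bar{X} > 0$. Note that $X \sim \text{Exp}(\lambda)$ so that:
\[
M_X(t) = \frac{\lambda}{\lambda - t},
\]
and the m.g.f. of the sample mean, $\bar{X}$, is:
\[
\begin{aligned}
M_{\bar{X}}(t) &= M_{\sum_{i=1}^{n} X_i/n}(t) = M_{\sum_{i=1}^{n} X_i}(t/n) = \left(M_{X_i}(t/n)\right)^n \\
&= \left(\frac{\lambda}{\lambda - t/n}\right)^n = \left(\frac{n\lambda}{n\lambda - t}\right)^n,
\end{aligned}
\]
which is the m.g.f. of a Gamma$(n, n\lambda)$. Therefore,
\[
2 n \lambda \bar{X} \sim \text{Gamma}\left(n, \frac{1}{2}\right) = \chi^2(2n),
\]
whose distribution is therefore free of the parameter $\lambda$. Thus, using this as a pivot, we have:
\[
\Pr\left(2 n \lambda \bar{X} < \chi^2_{1-\alpha}(2n)\right) = 1 - \alpha,
\]
or, equivalently,
\[
\Pr\left(\lambda < \frac{\chi^2_{1-\alpha}(2n)}{2 n \bar{X}}\right) = 1 - \alpha.
\]
For $\alpha = 0.05$, the required constant is:
\[
a = \frac{\chi^2_{0.95}(2n)}{2n},
\]
where $\chi^2_{0.95}(2n)$ denotes the 95th percentile of a chi-squared distribution with $2n$ degrees of freedom.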
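That $2n\lambda\bar{X}$ is a pivot can be illustrated by simulation: its sample mean and variance should be close to $2n$ and $4n$ (the moments of a $\chi^2(2n)$ variable) whatever the value of $\lambda$. The sketch below is illustrative only; the sample size $n = 5$, the rates tried, the number of replications and the seed are assumptions made here.

```python
import random
import statistics

def simulate_pivot(n, lam, reps=20_000, seed=2131):
    """Simulated draws of 2 * n * lam * (sample mean) for Exp(lam) samples of size n."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        total = sum(rng.expovariate(lam) for _ in range(n))  # equals n * sample mean
        draws.append(2.0 * lam * total)
    return draws

n = 5
for lam in (0.5, 3.0):
    pivot = simulate_pivot(n, lam)
    print(lam, statistics.fmean(pivot), statistics.variance(pivot))
```

Both rates should give a mean near 10 and a variance near 20, which is what a distribution free of $\lambda$ requires.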
75

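Solution 2.15: [wk06Q2, Exercise, Schedule] Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$,
and let $Z_k$ be standard normally distributed. If $\mu$ is known, then it is known that:
$$\left(\frac{X_k - \mu}{\sigma}\right)^2 = Z_k^2 \sim \chi^2(1),$$
so that:
$$\sum_{k=1}^n \left(\frac{X_k - \mu}{\sigma}\right)^2 = \sum_{k=1}^n Z_k^2 \sim \chi^2(n),$$
and to construct a $100(1-\alpha)\%$ confidence interval for $\sigma^2$, we define $\chi^2_{\alpha/2}(n)$ and $\chi^2_{1-\alpha/2}(n)$ to be the
$(\alpha/2)$th and $(1-\alpha/2)$th quantiles respectively of a chi-squared distribution with $n$ degrees of freedom.
Using the above as a pivot quantity, we have:
$$\Pr\left(\chi^2_{\alpha/2}(n) < \frac{\sum_{k=1}^n (X_k - \mu)^2}{\sigma^2} < \chi^2_{1-\alpha/2}(n)\right) = 1 - \alpha,$$
which implies that:
$$\Pr\left(\frac{\sum_{k=1}^n (X_k - \mu)^2}{\chi^2_{1-\alpha/2}(n)} < \sigma^2 < \frac{\sum_{k=1}^n (X_k - \mu)^2}{\chi^2_{\alpha/2}(n)}\right) = 1 - \alpha.$$
Thus we have a $100(1-\alpha)\%$ confidence interval estimate for $\sigma^2$ when $\mu$ is known:
$$\frac{\sum_{k=1}^n (X_k - \mu)^2}{\chi^2_{1-\alpha/2}(n)} < \sigma^2 < \frac{\sum_{k=1}^n (X_k - \mu)^2}{\chi^2_{\alpha/2}(n)}.$$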
Solution 2.16: [wk06Q3, Exercise, Schedule] We have a random sample from a continuous distribution.
1. To prove that the c.d.f., when viewed as a random variable, has a uniform distribution, we have, for $0 \le x \le 1$:
$$\Pr\left(F_X(X) \le x\right) = \Pr\left(X \le F_X^{-1}(x)\right) = F_X\left(F_X^{-1}(x)\right) = x.$$
This is the c.d.f. of a Uniform(0, 1) random variable: $F_X(X)$ is a probability, so it lies between 0 and 1,
and its c.d.f. is linear on that range, so every value between 0 and 1 is equally likely.
2. Let $W = \log\left(1/F_X(X)\right)$. Then we have:
$$F_W(x) = \Pr(W \le x) = \Pr\left(\log(1/F_X(X)) \le x\right) = \Pr\left(-\log(F_X(X)) \le x\right)$$
$$= \Pr\left(\log(F_X(X)) \ge -x\right) = 1 - \Pr\left(\log(F_X(X)) < -x\right) = 1 - \Pr\left(F_X(X) \le e^{-x}\right) = 1 - e^{-x},$$
so that its density is (take the derivative of $F_W(x)$ w.r.t. $x$):
$$f_W(x) = \frac{\partial F_W(x)}{\partial x} = e^{-x},$$
which implies $W \sim \mathrm{Exp}(1)$, standard exponential.
3. Let $W_k = \log\left(1/F_X(X_k)\right) \sim \mathrm{Exp}(1)$. Then, the m.g.f. of $W_k$, with $\lambda = 1$, is given by:
$$M_{W_k}(t) = (1 - t)^{-1}.$$
Using the m.g.f. technique we have, using the properties of the m.g.f. (week 1), for $Y = \sum_{k=1}^n W_k$:
$$M_Y(t) = M_{W_1 + \ldots + W_n}(t) = \left(M_{W_k}(t)\right)^n = (1 - t)^{-n},$$
so that $Y \sim \mathrm{Gamma}(n, 1)$. It has a gamma distribution with parameters $\alpha = n$ and $\beta = 1$.
4. Using the properties of the m.g.f. and the result from part 3, we have, for $2Y = 2\sum_{k=1}^n W_k$:
$$M_{2Y}(t) = M_{W_1 + \ldots + W_n}(2t) = \left(M_{W_k}(2t)\right)^n = (1 - 2t)^{-n} = \left(1 - \frac{t}{1/2}\right)^{-2n/2},$$
$$2Y = 2\sum_{k=1}^n W_k \sim \mathrm{Gamma}\left(\frac{2n}{2}, \frac{1}{2}\right) = \chi^2(2n).$$
Thus, you can always choose $2\sum_{k=1}^n W_k$ as a pivot because its distribution is free of any parameter.
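Solution 2.17: [wk06Q4, Exercise, Schedule] Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a
Gamma$(3, \lambda)$ so that its m.g.f. is:
$$M_{X_i}(t) = \left(\frac{\lambda}{\lambda - t}\right)^3.$$
1. Use the m.g.f. technique. We have that the m.g.f. of the sample mean $\bar X$ can be written as:
$$M_{\bar X}(t) = M_{\sum_{i=1}^n X_i/n}(t) = M_{\sum_{i=1}^n X_i}(t/n) = \left(M_{X_i}(t/n)\right)^n = \left(\frac{\lambda}{\lambda - t/n}\right)^{3n} = \left(\frac{n\lambda}{n\lambda - t}\right)^{3n},$$
which has the form of the m.g.f. of a Gamma$(3n, n\lambda)$ distribution.
2. Note that the m.g.f. of the random variable $Y = 2n\lambda\bar X$ is:
$$M_Y(t) = E\left[e^{Yt}\right] = M_{\bar X}(2n\lambda t) = \left(\frac{1}{1 - 2t}\right)^{6n/2},$$
which is the m.g.f. of a $\chi^2(6n)$. Thus, we can use it as a pivot to construct a confidence interval.
To construct a lower 95% confidence interval for $\lambda$, we note:
$$\Pr\left(0 < 2n\lambda\bar X < \chi^2_{0.95}(6n)\right) = 0.95,$$
so that equivalently:
$$\Pr\left(0 < \lambda < \frac{\chi^2_{0.95}(6n)}{2n\bar X}\right) = 0.95.$$
Therefore:
$$\left(0, \frac{\chi^2_{0.95}(6n)}{2n\bar X} = U\right)$$
is the required confidence interval.
3. Similar to part 2 above, it can be shown that:
$$\Pr\left(\chi^2_{0.05}(6n) < 2n\lambda\bar X < \infty\right) = 0.95,$$
so that equivalently:
$$\Pr\left(\frac{\chi^2_{0.05}(6n)}{2n\bar X} < \lambda < \infty\right) = 0.95,$$
and, hence:
$$\left(L = \frac{\chi^2_{0.05}(6n)}{2n\bar X}, \infty\right)$$
provides an upper confidence interval.
4. Similar to parts 2 and 3 above, the following:
$$\left(\frac{\chi^2_{0.025}(6n)}{2n\bar X}, \frac{\chi^2_{0.975}(6n)}{2n\bar X}\right)$$
provides the required confidence interval.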
5. Using $\sum_{k=1}^{20} x_k = 98.2$, so that $2n\bar x = 2(98.2) = 196.4$, we have the following 95% confidence intervals for $\lambda$:
$$\text{Lower Tail: } \left(0, \frac{\chi^2_{0.95}(120)}{2(98.2)}\right) = \left(0, \frac{146.57}{196.4}\right) = (0, 0.7463)$$
$$\text{Upper Tail: } \left(\frac{\chi^2_{0.05}(120)}{2(98.2)}, \infty\right) = \left(\frac{95.70}{196.4}, \infty\right) = (0.4873, \infty)$$
$$\text{Both Tails: } \left(\frac{\chi^2_{0.025}(120)}{2(98.2)}, \frac{\chi^2_{0.975}(120)}{2(98.2)}\right) = \left(\frac{91.58}{196.4}, \frac{152.21}{196.4}\right) = (0.4663, 0.7750).$$
Note: the values $\chi^2_{0.95}(120) = 146.57$, $\chi^2_{0.05}(120) = 95.70$, $\chi^2_{0.975}(120) = 152.21$, and $\chi^2_{0.025}(120) = 91.58$
are computed using R or Excel (using: =chiinv(q,df)). Formulae and Tables only provides
percentage points of the chi-squared distribution up to 100 degrees of freedom (page 169).
Alternatively: use the approximate distribution of a chi-squared random variable for large $n$. Let
$Y \sim \chi^2(n)$, then we know $Y = \sum_{i=1}^n Z_i^2$, where the $Z_i$ are i.i.d. standard normal random variables.
Applying the central limit theorem (see week 5) we have, approximately, $Y \sim N(n, 2n)$. Using this we
have that $\chi^2_\alpha(n) \approx n + \sqrt{2n}\,z_\alpha$, thus: $\chi^2_{0.95}(120) \approx 145.48$ ($z_{0.95} = 1.6449$), $\chi^2_{0.05}(120) \approx 94.52$
($z_{0.05} = -1.6449$), $\chi^2_{0.975}(120) \approx 150.36$ ($z_{0.975} = 1.96$), and $\chi^2_{0.025}(120) \approx 89.64$ ($z_{0.025} = -1.96$).
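The normal approximation to the chi-squared quantiles, and the resulting interval end point, can be sketched as follows. This is an illustrative check using only the Python standard library; the function name is hypothetical, and the exact quantile 146.57 is taken from the note above rather than computed.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile_normal_approx(prob, df):
    """Approximate chi-squared quantile via Y ~ N(df, 2 df) (large-df approximation)."""
    return df + sqrt(2 * df) * NormalDist().inv_cdf(prob)


total = 98.2  # sum of the 20 observations, as given in the solution
approx_upper = chi2_quantile_normal_approx(0.95, 120)
approx_lower = chi2_quantile_normal_approx(0.05, 120)

# Upper end point of the lower-tail interval: exact table value vs approximation.
lower_tail_exact = 146.57 / (2 * total)
lower_tail_approx = approx_upper / (2 * total)

print(round(approx_upper, 2), round(approx_lower, 2))
print(round(lower_tail_exact, 4), round(lower_tail_approx, 4))
```

The approximate quantiles differ from the exact ones by about one unit at 120 degrees of freedom, which shifts the interval end point only in the third decimal place.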
Solution 2.18: [wk06Q5, Exercise, Schedule] The weight loss is clearly:
$$D = \text{After-Weight} - \text{Before-Weight}.$$
A negative difference will mean a weight loss and a positive difference a weight gain. The sample
mean and standard deviation of the difference are:
$$\bar d = -6.4444 \quad\text{and}\quad s_D = \sqrt{\frac{1}{n-1}\left(\sum_{i=1}^n x_i^2 - n\bar x^2\right)} = 5.1988.$$
The required 95% confidence interval is therefore given by:
$$\bar d \pm t_{1-\alpha/2,\,n-1}\, s_D\big/\sqrt{n} = -6.4444 \pm (2.306)\, 5.1988\big/\sqrt{9} = (-10.44056322, -2.448236783).$$
This result may differ slightly because of rounding.
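The interval can be reproduced from the summary statistics. The sketch below uses the rounded values quoted in the solution ($\bar d$, $s_D$, $n = 9$ and the tabulated $t$ quantile 2.306), so the last digits differ slightly from the figures above; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics quoted in the solution (rounded).
d_bar = -6.4444
s_d = 5.1988
n = 9
t_crit = 2.306  # t quantile with 8 degrees of freedom at 0.975, from tables

half_width = t_crit * s_d / sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)

print(tuple(round(v, 4) for v in ci))
```

Both end points are negative, so the interval lies entirely on the weight-loss side of zero.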
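1. (a) We have that the difference in sample means, given known population variances, is distributed as:
$$(\bar X_1 - \bar X_2) \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right),$$
note that the samples are independent, thus $\mathrm{Cov}(\bar X_1, \bar X_2) = 0$.
The pivotal quantity is then:
$$\frac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1).$$
Finally, using the pivotal quantity we have that the 95% confidence interval is given by:
$$(\bar x_1 - \bar x_2) - z_{1-0.025}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < (\mu_1 - \mu_2) < (\bar x_1 - \bar x_2) + z_{1-0.025}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},$$
where $z_{1-0.025}$ is the 0.975 quantile of a standard normal random variable.
(b) Now, we have that the difference in sample means, given equal, but unknown, population
variance $\sigma^2$, satisfies:
$$t_k = \frac{\dfrac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}}}{\sqrt{\dfrac{k S_p^2}{\sigma^2}\Big/ k}} = \frac{Z}{\sqrt{Y/k}},$$
where $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$, $Z \sim N(0, 1)$, and $Y \sim \chi^2_{n_1+n_2-2}$, thus $k = n_1 + n_2 - 2$. The
pivotal quantity is given by:
$$\frac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}} \sim t_k.$$
The 95% confidence interval is then given by:
$$(\bar x_1 - \bar x_2) - t_{1-0.025,k}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < (\mu_1 - \mu_2) < (\bar x_1 - \bar x_2) + t_{1-0.025,k}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$
where $t_{1-0.025,k}$ is the 0.975 quantile of a student-t random variable with parameter $k = n_1 + n_2 - 2$.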
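2. We have that $\dfrac{(n_1 - 1)S_1^2}{\sigma_1^2} \sim \chi^2(n_1 - 1)$ and $\dfrac{(n_2 - 1)S_2^2}{\sigma_2^2} \sim \chi^2(n_2 - 1)$. Thus we have:
$$F_{k,l} = \frac{\dfrac{(n_1 - 1)S_1^2/\sigma_1^2}{n_1 - 1}}{\dfrac{(n_2 - 1)S_2^2/\sigma_2^2}{n_2 - 1}} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{Y_k/k}{Y_l/l},$$
where $Y_k \sim \chi^2(n_1 - 1)$, hence $k = n_1 - 1$, and $Y_l \sim \chi^2(n_2 - 1)$, hence $l = n_2 - 1$. The pivotal
quantity is given by:
$$\frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2} \sim F_{k,l}.$$
Thus the 90% confidence interval is given by:
$$\frac{s_1^2}{s_2^2}\,\frac{1}{F_{0.95}(n_1 - 1, n_2 - 1)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,\frac{1}{F_{0.05}(n_1 - 1, n_2 - 1)},$$
where $F_\alpha(n_1 - 1, n_2 - 1)$ is the $\alpha$th quantile of an F-distributed random variable with parameters
$n_1 - 1$ and $n_2 - 1$.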
3. (a) We use here the t-distribution, because $n_1 = 10$ and $n_2 = 10$; hence, we do not have a large
sample and cannot use that $t_n \to N(0, 1)$ as $n \to \infty$. We have $S_1^2 = (1.22)^2 = 1.4884$ and
$S_2^2 = (0.96)^2 = 0.9216$, hence:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{(10 - 1)1.4884 + (10 - 1)0.9216}{10 + 10 - 2} = 1.205.$$
Thus, the 95% confidence interval is given by:
$$(\bar x_1 - \bar x_2) - t_{1-0.025,k}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < (\mu_1 - \mu_2) < (\bar x_1 - \bar x_2) + t_{1-0.025,k}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$(2.83) - t_{1-0.025,18}\sqrt{1.205}\sqrt{\frac{2}{10}} < (\mu_1 - \mu_2) < (2.83) + t_{1-0.025,18}\sqrt{1.205}\sqrt{\frac{2}{10}}$$
$$1.798582315 < (\mu_1 - \mu_2) < 3.861417685,$$
using $t_{1-0.025,18} = 2.101$ (see table Formulae and Tables page 163). Note that as $n \to \infty$,
$t_{1-0.025,n} \to z_{1-0.025} = 1.96$, so we see that the small sample size makes a difference.
(b) The 90% confidence interval for the ratio of the population variances is given by:
$$\frac{s_1^2}{s_2^2}\,\frac{1}{F_{0.95}(n_1 - 1, n_2 - 1)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,\frac{1}{F_{0.05}(n_1 - 1, n_2 - 1)}$$
$$\frac{1.4884}{0.9216}\,\frac{1}{3.179} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{1.4884}{0.9216}\, 3.179$$
$$0.508027 < \frac{\sigma_1^2}{\sigma_2^2} < 5.13414,$$
using $F_{0.05}(9, 9) = \dfrac{1}{F_{0.95}(9, 9)} = \dfrac{1}{3.179}$ and $F_{0.95}(9, 9) = 3.179$ (see table Formulae and Tables
page 172).
Since one lies in the 90% confidence interval for the ratio of the variances, we cannot reject
the hypothesis that the two population variances are equal at the 10% significance level. Therefore,
the assumption of equal variances is a reasonable assumption.
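Both intervals in part 3 can be recomputed from the summary statistics. The sketch below takes the sample standard deviations, the difference in sample means (2.83) and the tabulated quantiles (2.101 and 3.179) from the solution; the variable names are illustrative only.

```python
from math import sqrt

n1 = n2 = 10
s1_sq = 1.22 ** 2   # sample variance of the first sample
s2_sq = 0.96 ** 2   # sample variance of the second sample
diff = 2.83         # difference in sample means, as used in the solution
t_crit = 2.101      # t quantile, 18 degrees of freedom, 0.975 (tables)
f_crit = 3.179      # F quantile F_{0.95}(9, 9) (tables)

# Pooled variance and the 95% interval for the difference in means.
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
half_width = t_crit * sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2)
mean_ci = (diff - half_width, diff + half_width)

# 90% interval for the variance ratio, using F_{0.05}(9, 9) = 1 / F_{0.95}(9, 9).
ratio = s1_sq / s2_sq
var_ratio_ci = (ratio / f_crit, ratio * f_crit)

print(round(sp_sq, 3), mean_ci, var_ratio_ci)
```

Note that the pooled standard deviation entering the interval is $\sqrt{1.205}$, not 1.205 itself.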
Solution 2.20: [wk06Q7, Exercise, Schedule] (See Q9 before doing this.) Let $X$ be the number of customers
who indicated high satisfaction; $X \sim \mathrm{Binomial}(200, p)$.
Estimator of the parameter $p$: $\hat p = \dfrac{X}{n} = \dfrac{172}{200} = \dfrac{43}{50} = 0.86$. Then, approximately:
$$Z = \frac{0.86 - p}{\sqrt{\dfrac{0.86\,(1 - 0.86)}{200}}} \sim N(0, 1).$$
An approximate 95% confidence interval for $p$ is:
$$\left(0.86 - z_{1-0.025}\sqrt{\frac{0.86\,(1 - 0.86)}{200}},\; 0.86 + z_{1-0.025}\sqrt{\frac{0.86\,(1 - 0.86)}{200}}\right)$$
$$= \left(0.86 - 1.96\sqrt{\frac{0.86\,(1 - 0.86)}{200}},\; 0.86 + 1.96\sqrt{\frac{0.86\,(1 - 0.86)}{200}}\right)$$
$$= (0.81191005,\; 0.90808995).$$
Solution 2.21: [wk06Q8, Exercise, Schedule] The distribution of claim size under a certain class of
policy is modelled as a normal random variable, and previous years' records indicate that the standard
deviation is 120.
1. Let $\bar X = \dfrac{\sum_{i=1}^n X_i}{n}$, an unbiased estimator of $\mu$. We have:
$$\bar X \overset{dist}{\sim} N\left(\mu, \sigma^2_{\bar X} = \sigma^2/n\right).$$
To derive a 95% confidence interval for $\mu$, use as pivot function:
$$Z = \frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \overset{dist}{\sim} N(0, 1).$$
In general,
$$\left(\bar X - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar X + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
is a $100(1-\alpha)\%$ confidence interval for $\mu$. Hence the width of a 95% confidence
interval for $\mu$ is $2 z_{1-0.025}\dfrac{\sigma}{\sqrt{n}}$. For this problem,
$$\text{width} = 2(1.96)\frac{120}{10} = 47.04.$$
2. We want:
$$2 z_{1-0.025}\frac{\sigma}{\sqrt{n}} \le 10,$$
hence:
$$2(1.96)\frac{120}{\sqrt{n}} \le 10,$$
and $n \ge \left(2(1.96)\dfrac{120}{10}\right)^2 = 2212.7616$. The minimum sample size required is 2213.
3. The smaller the width of the confidence interval, the larger is the required sample size.
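The width and sample-size calculations can be sketched in a few lines. This is an illustrative check only; the helper names are hypothetical, and the sample size of 100 in part 1 is inferred from the divisor $\sqrt{n} = 10$ used above.

```python
from math import ceil, sqrt

SIGMA = 120.0   # known standard deviation from previous years' records
Z_975 = 1.96    # 0.975 quantile of the standard normal


def ci_width(n):
    """Width of the 95% confidence interval for the mean with n observations."""
    return 2 * Z_975 * SIGMA / sqrt(n)


def min_sample_size(max_width):
    """Smallest n whose 95% interval is no wider than max_width."""
    return ceil((2 * Z_975 * SIGMA / max_width) ** 2)


print(ci_width(100))
print(min_sample_size(10))
```

Because the width scales with $1/\sqrt{n}$, halving the target width roughly quadruples the required sample size, which is the point made in part 3.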
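1. A random variable $Y$ has a Poisson distribution with parameter $\lambda$ but there is a restriction that
zero counts cannot occur. The distribution of $Y$ in this case is referred to as the zero-truncated
Poisson distribution.
(a) Let $X$ be the not-truncated version of $Y$. Hence, $X$ has a Poisson distribution.
Note that:
$$\Pr(Y = y) = \Pr(X = y \mid X > 0) = \frac{\Pr(X = y)}{\Pr(X \ge 1)} = \frac{\Pr(X = y)}{1 - \Pr(X = 0)} = \frac{\lambda^y e^{-\lambda}}{y!\,(1 - e^{-\lambda})}, \quad \text{for } y = 1, 2, 3, \ldots,$$
and zero otherwise.
(b)
$$E[Y] = E[X \mid X > 0] = \sum_{y=1}^\infty \frac{y\,\lambda^y e^{-\lambda}}{y!\,(1 - e^{-\lambda})} = \frac{1}{1 - e^{-\lambda}}\sum_{y=1}^\infty \frac{\lambda^y e^{-\lambda}}{(y - 1)!}$$
$$= \frac{1}{1 - e^{-\lambda}}\sum_{y=0}^\infty \frac{\lambda^{y+1} e^{-\lambda}}{y!} = \frac{\lambda}{1 - e^{-\lambda}}\sum_{y=0}^\infty \frac{\lambda^{y} e^{-\lambda}}{y!} \overset{*}{=} \frac{\lambda}{1 - e^{-\lambda}},$$
* using that $\sum_{y=0}^\infty \dfrac{\lambda^y e^{-\lambda}}{y!}$ is the sum of the probability mass function of a Poisson random variable
with parameter $\lambda$ over all its values and thus equals one.
Alternatively, one could use:
$$E[X] = E[X \mid X > 0]\Pr(X > 0) + \underbrace{E[X \mid X \le 0]}_{=0}\Pr(X \le 0).$$
Using this we have:
$$E[X \mid X > 0] = \frac{E[X]}{\Pr(X > 0)} \overset{*}{=} \frac{\lambda}{1 - \Pr(X = 0)} \overset{**}{=} \frac{\lambda}{1 - e^{-\lambda}},$$
* using $E[X] = \lambda$ and ** using $\Pr(X = 0) = e^{-\lambda}\lambda^0/0! = e^{-\lambda}$.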
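2. (a) The likelihood function is:
$$L(\lambda; y) = \prod_{i=1}^n f_Y(y_i) = \prod_{i=1}^n \frac{\lambda^{y_i} e^{-\lambda}}{y_i!\,(1 - e^{-\lambda})} = \frac{\lambda^{\sum_{i=1}^n y_i}\, e^{-n\lambda}}{(1 - e^{-\lambda})^n \prod_{i=1}^n y_i!},$$
and the log-likelihood function is:
$$\ell(\lambda; y) = \sum_{i=1}^n \log\left(f_Y(y_i)\right) = -n\lambda - n\ln\left(1 - e^{-\lambda}\right) + \sum_{i=1}^n y_i \ln\lambda - \ln\left(\prod_{i=1}^n y_i!\right).$$
To find the maximum point, we set the derivative of the log-likelihood function equal to zero:
$$\frac{\partial \ell(\lambda; y)}{\partial\lambda} = -n - \frac{n e^{-\lambda}}{1 - e^{-\lambda}} + \frac{\sum_{i=1}^n y_i}{\lambda} = 0.$$
Equivalently,
$$-1 - \frac{e^{-\lambda}}{1 - e^{-\lambda}} + \frac{\bar Y}{\lambda} = 0,$$
or:
$$\bar Y - \lambda - \frac{\lambda e^{-\lambda}}{1 - e^{-\lambda}} = 0.$$
Also, from the method of moments:
$$\bar Y = E[Y] \overset{*}{=} \frac{\lambda}{1 - e^{-\lambda}}$$
$$0 = \bar Y - \lambda\,\frac{1 - e^{-\lambda} + e^{-\lambda}}{1 - e^{-\lambda}}$$
$$0 = \bar Y - \lambda - \frac{\lambda e^{-\lambda}}{1 - e^{-\lambda}},$$
* using the result for $E[Y]$ in part 1(b). Hence the maximum likelihood estimate is the same as the
method of moments estimate.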
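(b) The first derivative of the log-likelihood is:
$$\frac{\partial \ell(\lambda; y)}{\partial\lambda} = -n - \frac{n e^{-\lambda}}{1 - e^{-\lambda}} + \frac{\sum_{i=1}^n y_i}{\lambda},$$
so that the second derivative is:
$$\frac{\partial^2 \ell(\lambda; y)}{\partial\lambda^2} = \frac{(1 - e^{-\lambda})\,n e^{-\lambda} + n e^{-2\lambda}}{(1 - e^{-\lambda})^2} - \frac{\sum_{i=1}^n y_i}{\lambda^2} = \frac{n e^{-\lambda}}{(1 - e^{-\lambda})^2} - \frac{\sum_{i=1}^n y_i}{\lambda^2}.$$
Taking expectations, using $E\left[\sum_{i=1}^n Y_i\right] = \dfrac{n\lambda}{1 - e^{-\lambda}}$:
$$E\left[\frac{\partial^2 \ell(\lambda; y)}{\partial\lambda^2}\right] = \frac{n e^{-\lambda}}{(1 - e^{-\lambda})^2} - \frac{n\lambda}{\lambda^2\,(1 - e^{-\lambda})} = \frac{n e^{-\lambda}}{(1 - e^{-\lambda})^2} - \frac{n}{\lambda\,(1 - e^{-\lambda})}$$
$$= \frac{n\lambda e^{-\lambda} - n\,(1 - e^{-\lambda})}{\lambda\,(1 - e^{-\lambda})^2} = \frac{n\lambda e^{-\lambda} - n + n e^{-\lambda}}{\lambda\,(1 - e^{-\lambda})^2}.$$
The Cramér-Rao lower bound is minus the reciprocal of this expectation:
$$\mathrm{CRLB} = -\frac{\lambda\,(1 - e^{-\lambda})^2}{n\lambda e^{-\lambda} - n + n e^{-\lambda}} = \frac{\lambda\,(1 - e^{-\lambda})^2}{n - n e^{-\lambda} - n\lambda e^{-\lambda}}.$$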
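The estimating equation $\bar Y = \lambda/(1 - e^{-\lambda})$ has no closed-form solution, so in practice it is solved numerically. The sketch below uses bisection; the sample mean 2.0 and sample size 50 are hypothetical values chosen for illustration (no data are given in the solution), and the function names are not from the original.

```python
from math import exp


def truncated_mean(lam):
    """Mean of the zero-truncated Poisson distribution: lam / (1 - exp(-lam))."""
    return lam / (1 - exp(-lam))


def solve_mle(y_bar, iterations=200):
    """Solve truncated_mean(lam) = y_bar by bisection (requires y_bar > 1)."""
    if y_bar <= 1:
        raise ValueError("the sample mean of zero-truncated counts must exceed 1")
    # truncated_mean is increasing and truncated_mean(lam) > lam, so the root is below y_bar.
    lo, hi = 1e-9, y_bar
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if truncated_mean(mid) < y_bar:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def crlb(lam, n):
    """Cramer-Rao lower bound for an unbiased estimator of lam from n observations."""
    return lam * (1 - exp(-lam)) ** 2 / (n * (1 - exp(-lam) - lam * exp(-lam)))


lam_hat = solve_mle(2.0)      # hypothetical sample mean
bound = crlb(lam_hat, n=50)   # hypothetical sample size
print(lam_hat, bound)
```

For large $\lambda$ the truncation is negligible and the bound approaches $\lambda/n$, the familiar bound for the ordinary Poisson distribution.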
Solution 2.23: [wk06Q10, Exercise, Schedule] We observe $Y_i \sim \mathrm{Ber}(p)$, the indicator of a success. We
know that $\sum_{i=1}^n Y_i \sim \mathrm{Bin}(n, p)$. We have that $X = \sum_{i=1}^n Y_i$ is the number of successes.
1. We have that:
$$L(p, y) = \prod_{i=1}^n p^{y_i}(1 - p)^{1 - y_i} = p^x (1 - p)^{n - x} = L(p, x).$$
The maximum likelihood estimator can be found using the log-likelihood:
$$\ell(p, y) = \sum_{i=1}^n \left(y_i \log(p) + (1 - y_i)\log(1 - p)\right) = X\log(p) + (n - X)\log(1 - p) = \ell(p, x),$$
and then taking the derivative of $\ell(p, x)$ with respect to $p$ and setting it equal to zero:
$$0 = \frac{\partial \ell(p, x)}{\partial p} = \frac{X}{p} - \frac{n - X}{1 - p}$$
$$\frac{(1 - p)X}{p(1 - p)} - \frac{(n - X)p}{p(1 - p)} = 0$$
$$(1 - p)X = (n - X)p$$
$$X = np$$
$$\hat p = X/n.$$
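2. (a) The Cramér-Rao lower bound is given by:
$$\mathrm{Var}(T) \ge \frac{1}{n I_{f^\star}(p)},$$
where
$$I_{f^\star}(p) = E\left[\left(\frac{\partial \log(f(x, p))}{\partial p}\right)^2\right] = -E\left[\frac{\partial^2 \log(f(x, p))}{\partial p^2}\right]$$
$$= -E\left[\frac{\partial^2 \log\left(p^x (1 - p)^{1 - x}\right)}{\partial p^2}\right] = -E\left[\frac{\partial^2 \left(x\log(p) + (1 - x)\log(1 - p)\right)}{\partial p^2}\right]$$
$$= -E\left[\frac{\partial \left(x/p - (1 - x)/(1 - p)\right)}{\partial p}\right] = -E\left[-\frac{x}{p^2} - \frac{1 - x}{(1 - p)^2}\right]$$
$$= \frac{p}{p^2} + \frac{1 - p}{(1 - p)^2} = \frac{1}{p} + \frac{1}{1 - p} = \frac{1}{p(1 - p)},$$
using $E[X] = p$ for a single Bernoulli observation. Thus we have the CRLB:
$$\mathrm{Var}(T) \ge \frac{p(1 - p)}{n}.$$
(b) The variance of the maximum likelihood estimator is given by:
$$\mathrm{Var}(X/n) = \mathrm{Var}(X)/n^2 = np(1 - p)/n^2 = \frac{p(1 - p)}{n}.$$
Thus the MLE attains the CRLB.
(c) The approximate sampling distribution for $\hat p$ is obtained using the central limit theorem:
$$\hat p = X/n \sim N\left(p, p(1 - p)/n\right), \text{ approximately.}$$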
3. We have:
$$-1.96 < \frac{\hat p - p}{\sqrt{p(1 - p)/n}} < 1.96$$
$$\left|\frac{\hat p - p}{\sqrt{p(1 - p)/n}}\right| < 1.96$$
$$\frac{(\hat p - p)^2}{p(1 - p)/n} < 1.96^2$$
$$\hat p^2 + p^2 - 2\hat p p < 1.96^2\,\frac{p(1 - p)}{n}$$
$$\left(1 + 1.96^2/n\right)p^2 - \left(2\hat p + 1.96^2/n\right)p + \hat p^2 < 0$$
$$\frac{(2\hat p + 1.96^2/n) - \sqrt{(2\hat p + 1.96^2/n)^2 - 4\,(1 + 1.96^2/n)\,\hat p^2}}{2\,(1 + 1.96^2/n)} < p < \frac{(2\hat p + 1.96^2/n) + \sqrt{(2\hat p + 1.96^2/n)^2 - 4\,(1 + 1.96^2/n)\,\hat p^2}}{2\,(1 + 1.96^2/n)},$$
where the last step is derived using the abc-formula: $ax^2 + bx + c = 0 \Rightarrow d = b^2 - 4ac$, $x = \dfrac{-b \pm \sqrt{d}}{2a}$.
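4. Using $\sqrt{\hat p(1 - \hat p)} \to \sqrt{p(1 - p)}$ as $n \to \infty$ we can approximate the approximated pivotal quantity
by:
$$\frac{\hat p - p}{\sqrt{\hat p(1 - \hat p)/n}} \sim N(0, 1).$$
Using this approximation the 95% confidence interval becomes:
$$z_{0.025} < \frac{\hat p - p}{\sqrt{\hat p(1 - \hat p)/n}} < z_{0.975}$$
$$-1.96 < \frac{\hat p - p}{\sqrt{\hat p(1 - \hat p)/n}} < 1.96$$
$$-1.96\sqrt{\hat p(1 - \hat p)/n} < \hat p - p < 1.96\sqrt{\hat p(1 - \hat p)/n}$$
$$-1.96\sqrt{\hat p(1 - \hat p)/n} - \hat p < -p < 1.96\sqrt{\hat p(1 - \hat p)/n} - \hat p$$
$$\hat p - 1.96\sqrt{\hat p(1 - \hat p)/n} < p < \hat p + 1.96\sqrt{\hat p(1 - \hat p)/n}.$$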
5. Now, we apply the two confidence intervals to data:
(a) We have $n = 10$, $X = 4$ and $\hat p = X/n = 0.4$. Using this, the confidence interval in part 3 is
given by (0.168177581, 0.687330453) and in part 4 by (0.096358106, 0.703641894).
(b) We have $n = 200$, $X = 80$ and $\hat p = X/n = 0.4$. Using this, the confidence interval in part 3 is
given by (0.33460464, 0.469164561) and in part 4 by (0.332103608, 0.467896392).
We observe that for large $n$ the convergence $\sqrt{\hat p(1 - \hat p)} \to \sqrt{p(1 - p)}$ indeed gives a good
approximation, but for small $n$ (i.e., equal to 10) this does not hold. Therefore, the approximate
confidence interval in part 4 is substantially different from the confidence interval in part 3.
Note that in both cases we use the central limit theorem for the pivotal quantity. The central limit
theorem gives a good approximation if we have a large sum, which is not the case for $n = 10$.
Therefore, it would be better to use the exact binomial test if $n$ is small and not the normal
approximation. Hence, if $n$ is large, both the central limit theorem and the convergence
$\sqrt{\hat p(1 - \hat p)} \to \sqrt{p(1 - p)}$ can be used for a good approximation of the confidence interval for $p$, but
if n is small, one should use the exact Binomial pivotal quantity.
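The two intervals compared above can be computed side by side. The following is a minimal sketch; the function names `score_interval` (the quadratic-root interval of part 3) and `wald_interval` (the plug-in interval of part 4) are labels chosen here for illustration and do not appear in the original solution.

```python
from math import sqrt


def score_interval(x, n, z=1.96):
    """Interval from part 3: roots of (1 + z^2/n) p^2 - (2 p_hat + z^2/n) p + p_hat^2 = 0."""
    p_hat = x / n
    a = 1 + z * z / n
    b = 2 * p_hat + z * z / n
    root = sqrt(b * b - 4 * a * p_hat * p_hat)
    return ((b - root) / (2 * a), (b + root) / (2 * a))


def wald_interval(x, n, z=1.96):
    """Interval from part 4: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - half_width, p_hat + half_width)


for x, n in [(4, 10), (80, 200)]:
    print(n, score_interval(x, n), wald_interval(x, n))
```

For $n = 10$ the two intervals differ by several percentage points at each end, while for $n = 200$ they agree to about two decimal places, which matches the discussion above.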
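Solution 2.24: [wk06Q11, Exercise, Schedule] Note that the sample size is equal to 16 (i.e., $n = 16$),
thus we have a small sample size and have to use the student-t distribution for the population mean.
The 95% (i.e., $\alpha = 0.05$) confidence interval for the population mean is given by:
$$\bar x - t_{1-\alpha/2,\,n-1}\frac{s}{\sqrt{n}} < \mu < \bar x + t_{1-\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$
$$\frac{\sum_{i=1}^{16} x_i}{n} - t_{1-\alpha/2,\,n-1}\sqrt{\frac{\sum_{i=1}^{16} x_i^2 - n\bar x^2}{n - 1}\cdot\frac{1}{n}} < \mu < \frac{\sum_{i=1}^{16} x_i}{n} + t_{1-\alpha/2,\,n-1}\sqrt{\frac{\sum_{i=1}^{16} x_i^2 - n\bar x^2}{n - 1}\cdot\frac{1}{n}}$$
$$\frac{51.2}{16} - t_{1-0.025,15}\sqrt{\frac{243.19 - 163.84}{15}\cdot\frac{1}{16}} < \mu < \frac{51.2}{16} + t_{1-0.025,15}\sqrt{\frac{243.19 - 163.84}{15}\cdot\frac{1}{16}}$$
$$3.2 - 2.131\sqrt{\frac{5.29}{16}} < \mu < 3.2 + 2.131\sqrt{\frac{5.29}{16}}$$
$$1.974675 < \mu < 4.425325,$$
using $t_{1-0.025,15} = 2.131$ (see table Formulae and Tables page 163).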
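1. We have that:
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1).$$
Moreover, we know that $\chi^2(n - 1) = \sum_{i=1}^{n-1} Z_i^2$ is approximately $N(n - 1, 2(n - 1))$ as $n \to \infty$ due to the
central limit theorem. Thus, as $n \to \infty$ we approximately have:
$$\frac{(n - 1)S^2/\sigma^2 - (n - 1)}{\sqrt{2(n - 1)}} \sim N(0, 1).$$
2. Using $n = 101$, we need to find:
$$\Pr\left(S^2 > 1.1\sigma^2\right) = \Pr\left(S^2/\sigma^2 > 1.1\right)$$
$$= \Pr\left(\frac{S^2/\sigma^2\,(n - 1) - (n - 1)}{\sqrt{2(n - 1)}} > \frac{1.1\,(n - 1) - (n - 1)}{\sqrt{2(n - 1)}}\right)$$
$$= \Pr\left(Z > \frac{0.1\,(n - 1)}{\sqrt{2(n - 1)}}\right) = 1 - \Phi\left(0.1\,\frac{n - 1}{\sqrt{2(n - 1)}}\right)$$
$$= 1 - \Phi\left(0.1\,\frac{100}{\sqrt{200}}\right) = 1 - \Phi\left(\frac{1}{\sqrt{2}}\right) = 1 - 0.76025 = 0.23975.$$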
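Solution 2.26: [wk06Q13, Exercise, Schedule] We have that $X_i \sim \mathrm{POI}(\lambda)$ i.i.d. for $i = 1, \ldots, 500$.
Using the moment generating function technique we have $M_{\sum_{i=1}^{500} X_i}(t) = \left(M_{X_i}(t)\right)^{500} = \exp\left(\lambda(\exp(t) - 1)\right)^{500} =
\exp\left(500\lambda(\exp(t) - 1)\right)$. Thus we have $\sum_{i=1}^{500} X_i = X \sim \mathrm{POI}(500\lambda)$.
Due to the central limit theorem, we have that $\sum_{i=1}^{500} X_i = X$ is approximately normally distributed
with mean $500\lambda$ and variance $500\lambda$. Thus, approximately:
$$\frac{X - 500\lambda}{\sqrt{500\lambda}} \sim N(0, 1).$$
With the observed total $X = 83$ we have:
$$z_{0.025} < \frac{83 - 500\lambda}{\sqrt{500\lambda}} < z_{0.975}$$
$$-1.96 < \frac{83/\sqrt{500} - \sqrt{500}\,\lambda}{\sqrt{\lambda}} < 1.96$$
$$\left|\frac{83 - 500\lambda}{\sqrt{500\lambda}}\right| < 1.96$$
$$\frac{83^2}{500} + 500\lambda^2 - 2\cdot 83\,\lambda < 1.96^2\,\lambda$$
$$\frac{83^2}{500} + 500\lambda^2 - \left(2\cdot 83 + 1.96^2\right)\lambda < 0$$
$$0.133923 < \lambda < 0.205761,$$
where the last step is derived using the abc-formula, i.e., $ax^2 + bx + c = 0 \Rightarrow d = b^2 - 4ac$, $x = \dfrac{-b \pm \sqrt{d}}{2a}$.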
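The end points of the interval are the two roots of the quadratic in $\lambda$. The sketch below clears the denominator 500 first, so the coefficients are those of $500^2\lambda^2 - (2\cdot 83\cdot 500 + 1.96^2\cdot 500)\lambda + 83^2 = 0$; the variable names are illustrative.

```python
from math import sqrt

n = 500       # number of Poisson observations
total = 83    # observed sum of the observations
z = 1.96      # 0.975 quantile of the standard normal

# (total - n*lam)^2 < z^2 * n * lam, written as a*lam^2 + b*lam + c < 0.
a = n * n
b = -(2 * total * n + z * z * n)
c = total * total

disc = sqrt(b * b - 4 * a * c)
lower = (-b - disc) / (2 * a)
upper = (-b + disc) / (2 * a)

print(round(lower, 6), round(upper, 6))
```

The point estimate $83/500 = 0.166$ lies inside the interval but not at its midpoint, because the variance of the pivot depends on $\lambda$.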
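Solution 2.27: [wk06Q14, Exercise, Schedule] We have that $X_i \sim \mathrm{UNIF}(0, \theta)$ i.i.d. for $i = 1, \ldots, n$.
We denote $U = (1/\theta)X_{(n)}$.
1. We know (week 5) that the cumulative distribution function of the maximum is $F_{X_{(n)}}(x_{(n)}) =
\left(F_X(x_{(n)})\right)^n$ and we have $F_X(x) = \dfrac{x}{\theta}$, if $0 < x < \theta$. Thus we have:
$$F_{X_{(n)}}\left(x_{(n)}\right) = \begin{cases} 0, & \text{if } x_{(n)} < 0; \\ \left(\dfrac{x_{(n)}}{\theta}\right)^n, & \text{if } 0 \le x_{(n)} \le \theta; \\ 1, & \text{if } x_{(n)} > \theta. \end{cases}$$
Hence, using the transformation $U = (1/\theta)X_{(n)}$ we have:
$$F_U(u) = \begin{cases} 0, & \text{if } u < 0; \\ u^n, & \text{if } 0 \le u \le 1; \\ 1, & \text{if } u > 1. \end{cases}$$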
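2. We use $U$ as a pivotal quantity. To find the confidence interval for $\theta$ we have, using $\theta = X_{(n)}/U$:
$$\Pr(q_1 \le \theta) = 0.95$$
$$\Pr\left(q_1 \le X_{(n)}/U\right) = 0.95$$
$$\Pr\left(\frac{q_1}{X_{(n)}} \le 1/U\right) = 0.95$$
$$\Pr\left(U \le \frac{X_{(n)}}{q_1}\right) = 0.95$$
$$F_U\left(\frac{X_{(n)}}{q_1}\right) = 0.95$$
$$\left(\frac{X_{(n)}}{q_1}\right)^n = 0.95$$
$$\frac{X_{(n)}}{q_1} = 0.95^{1/n}$$
$$q_1 = X_{(n)}\, 0.95^{-1/n}$$
$$\Pr\left(X_{(n)}\, 0.95^{-1/n} \le \theta\right) = 0.95,$$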
* using $U = (1/\theta)X_{(n)}$, i.e. $\theta = X_{(n)}/U$. Thus, the one-sided 95% confidence interval for $\theta$ implied by
the last line is $\left(X_{(n)}\, 0.95^{-1/n}, \infty\right)$.
Module 3
Hypothesis Test
3.1 Statistical test procedure

Exercise 3.1: [wk07Q1, Solution, Schedule] Explain carefully the distinction between each of the
following pairs of terms:
1. null and alternative hypotheses;
2. one-tailed and two-tailed hypotheses;
3. simple and composite hypotheses;
4. Type I and Type II errors;
Exercise 3.2: [wk07Q2, Solution, Schedule] Let $X_1, X_2, \ldots, X_{10}$ be a random sample of size 10 from
a Poisson distribution with mean $\lambda$. Consider the critical region $C$ defined by:
$$C = \left\{(x_1, x_2, \ldots, x_{10}) : \sum_{k=1}^{10} x_k \ge 3\right\}.$$
1. Show that $C$ is a best critical region for testing $H_0: \lambda = 0.1$ against $H_a: \lambda = 0.5$.
2. Determine the level of significance for this test.
Exercise 3.3: [wk07Q3, Solution, Schedule] Let $X_1, X_2, \ldots, X_n$ be a random sample from the density
function:
$$f_X(x \mid \theta) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}(x - \theta)^2\right).$$
At a level of significance $\alpha$, find the best critical region (or most powerful test) for testing the simple
null $H_0: \theta = 0$ against the simple alternative $H_a: \theta = 1$.
Exercise 3.4: [wk07Q4, Solution, Schedule] Let $X_1, X_2, \ldots, X_n$ be a random sample from a Poisson($\lambda$)
distribution. In testing the simple null $H_0: \lambda = \lambda_0$ against the simple alternative $H_a: \lambda = \lambda_1$, where
$\lambda_1 > \lambda_0$:
1. Find the best critical region (or most powerful test).
2. Determine the distribution of the test statistic under the null hypothesis.
Exercise 3.5: [wk07Q5, Solution, Schedule] Past Institute exam
1. A manufacturing company produces screws of a particular size which are put into boxes of
150. On a particular day a random sample of such boxes is taken from each of the morning
and afternoon production runs. The number of defective screws found in each sampled box are
given in the following table:
Morning 28 17 18 16 20 12 11 10 18 17 20 25
Afternoon 19 15 22 21 9 14 17 13 22 9
Table 3.1: Number of defectives per box
(a) Test for a difference between the mean number of defectives produced in the morning and
afternoon (you may assume that the underlying population variances are equal).
(b) Plot the data in an appropriate and simple way and comment briefly on the validity of the
test of part (a).
2. On another day screws are put into boxes of 100. The table below gives the number of defectives
in twenty boxes sampled from this day's production run.
5 15 18 12 8 7 9 14 11 10
6 18 14 9 18 12 11 5 18 12
Table 3.2: Number of defectives per box of 100 screws
(a) Carry out a test to establish whether there is a difference between the proportions of de-
fectives produced on the two days.
(b) Carry out a test to establish whether the proportion of defectives in boxes of 100 screws is
more than 9%.
Exercise 3.6: [wk10Q10, Solution, Schedule] Suppose that $Y$ represents a single observation from
the probability density given by:
$$f_Y(y \mid \theta) = \begin{cases} \theta y^{\theta - 1}, & 0 < y < 1 \\ 0, & \text{elsewhere.} \end{cases}$$
Find the most powerful test with significance level $\alpha = 0.05$ to test $H_0: \theta = 2$ against $H_a: \theta = 1$.
3.2 Properties of the hypothesis testing

Exercise 3.7: [wk08Q1, Solution, Schedule] Explain carefully the distinction between the signifi-
cance level and power.
Exercise 3.8: [wk08Q2, Solution, Schedule] Let $X$ have a Bernoulli distribution where $\theta = \Pr(X = 1)$.
Take a random sample of size $n = 10$ from this Bernoulli distribution and consider the test:
$$H_0: \theta \le 1/2 \quad\text{versus}\quad H_a: \theta > 1/2.$$
Using the critical region $C = \left\{(X_1, X_2, \ldots, X_{10}) : \sum_{k=1}^{10} x_k \ge 6\right\}$:
1. Find the power function and sketch it.
2. Find the size of this test.
Exercise 3.9: [wk08Q3, Solution, Schedule] Recall from Exercise 3.2:
Let $X_1, X_2, \ldots, X_{10}$ be a random sample of size 10 from a Poisson distribution with mean $\lambda$. Consider
the critical region $C$ defined by:
$$C = \left\{(x_1, x_2, \ldots, x_{10}) : \sum_{k=1}^{10} x_k \ge 3\right\}.$$
Determine the power of the test under $H_a$.
Exercise 3.10: [wk08Q4, Solution, Schedule] Recall from Exercise 3.3:
Let $X_1, X_2, \ldots, X_n$ be a random sample from the density function:
$$f_X(x \mid \theta) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}(x - \theta)^2\right).$$
Determine the power of this test.
Exercise 3.11: [wk08Q5, Solution, Schedule] Prove:
$$\sum_{j=1}^k \sum_{i=1}^{n_j} (x_{ij} - \bar x)^2 = \sum_{j=1}^k \sum_{i=1}^{n_j} (x_{ij} - \bar x_j)^2 + \sum_{j=1}^k n_j (\bar x_j - \bar x)^2$$
where
$$\bar x_j = \sum_{i=1}^{n_j} \frac{x_{ij}}{n_j}, \qquad \bar x = \sum_{j=1}^k \sum_{i=1}^{n_j} \frac{x_{ij}}{N} = \sum_{j=1}^k \frac{n_j \bar x_j}{N}.$$
Hint: 1) Rewrite the left side by adding and subtracting $\bar x_j$ within the squares;
2) Rewrite it using the binomial expansion (see F&T page 2).
Exercise 3.12: [wk08Q7, Solution, Schedule] The following observations represent weight loss (in
pounds) of men of similar physique, metabolic activity, and so on, after a certain amount of time on
three types of diet programs: A, B, and C.
Diet Program
A B C
3 2 7
7 4 10
4 6 8
5 6 9
6 5 4
- 3 8
- 4 -
Test for the differences in the mean weight loss between the three diet programs. State any assumptions
you make. Provide the point estimates of the mean losses, and the ANOVA table used
to partition the various sources of variation.

The following test concerning the mean claim amount ($\mu$) for a certain class of policy:
$$H_0: \mu = 200 \quad\text{v.s.}\quad H_1: \mu \ne 200,$$
is to be performed. A random sample of 50 claims is examined and yields a mean amount of 207
and a standard deviation of 42. Calculate the approximate p-value for the test.

When comparing the mean premiums for policies issued by two companies, a two-sample t test is
performed assuming equal population variances. The sample sizes and sample variances are given
by:
$n_1 = 25$, $s_1^2 = 139.7$
$n_2 = 30$, $s_2^2 = 76.6$
Perform an approximate F test at the 5% level to investigate the validity of the equal variance assumption.

The following data refer to an outbreak of botulism, a form of food poisoning that may be fatal. Each
subject is a person who contracted botulism in the outbreak. The variables recorded are the subject's
age in years, the time in hours between eating the infected food and the first signs of illness (incubation
period) and whether the subject survived (denoted by survival category Y) or died (denoted by survival
category N).
Subject                1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
Age (x)               29 39 44 37 42 17 38 43 51 30 32 59 33 31 32 32 36 50
Incubation period (y) 13 46 43 34 20 20 18 72 19 36 48 44 21 32 86 48 28 16
Survival               N  Y  Y  N  N  Y  N  Y  N  N  N  Y  N  N  Y  N  Y  N
Died:     $\sum x = 405$, $\sum y = 305$, $\sum x^2 = 15517$, $\sum y^2 = 10035$
Survived: $\sum x = 270$, $\sum y = 339$, $\sum x^2 = 11396$, $\sum y^2 = 19665$
1. A scatterplot of incubation period against age is given below, in which different symbols are
used for subjects who died and for subjects who survived.
[Figure: A Plot of Incubation Period against Age. Horizontal axis: Age (15 to 60); vertical axis: Incubation Period (10 to 90); legend: Died, Survived.]
Comment briefly on any relationships between age and incubation period for those subjects
who died and for those who survived.
2. Construct suitable dotplots to investigate any relationship between:
(a) age and survival, and

(b) incubation period and survival
and make a brief informal comparison of the died and survived groups based on these dotplots.
3. Construct 95% and 99% confidence intervals for the mean difference between the incubation
period for subjects who survived and subjects who died (i.e., take the mean incubation period
for subjects who survived minus the mean incubation period for subjects who died).
Comment briefly on these confidence intervals.
4. (a) Construct a test to investigate whether the variances of the incubation period for subjects
who died and subjects who survived are equal.
(b) Comment on the validity of the assumptions that are required for the confidence intervals
given in part 3 to be approximate.

It is desired to investigate the level of premium charged by two companies for contents policies for
houses in a certain area. Random samples of 10 houses insured by company A are compared with 10
similar houses insured by company B. The premiums charged in each case are as follows:
Company A 117 154 166 189 190 202 233 263 289 331
Company B 142 160 166 188 221 241 276 279 284 302
For these data: $\sum A = 2{,}134$, $\sum A^2 = 494{,}126$, $\sum B = 2{,}259$, $\sum B^2 = 541{,}463$.
1. Illustrate the data given above on a suitable diagram and hence comment briefly on the validity
of the assumptions required for a two-sample t test for the premiums of these two companies.
2. Assuming that the premiums are normally distributed, carry out a formal test to check that it is
appropriate to apply a two-sample t test to these data.
3. Test whether the level of premiums charged by company B was higher than that charged by
company A. State your p-value and conclusions clearly.
4. Calculate a 95% confidence interval for the difference between the proportions of premiums of
each company that are in excess of 200. Comment briefly on your result.
5. The average premium charged by company A in the previous year was 170. Formally test
whether company A appears to have increased its premium since the previous year.
3.3 Non-parametric Tests

Exercise 3.17: [wk09Q1, Solution, Schedule] A group of 1,725 school children were cross-classified
according to their intelligence and their manner of clothing. A result of this classification is given
below:
dull intelligent very capable
very well clothed 81 322 233
well clothed 141 457 153
poorly clothed 127 163 48
Test for independence using a 1% level of significance.
1. Past Institute exam

A $2 \times 2$ contingency table was set up to investigate whether or not two classification criteria
are independent and resulted in the following data:
        I   II  Total
A      22   28     50
B      28   22     50
Total  50   50    100
Calculate the observed $\chi^2$ test statistic and state an approximate conclusion concerning the
independence of the two criteria.
2. (Added to the past Institute exam) Perform Pearson's chi-square test using R.
Exercise 3.19: [wk09Q3, Solution, Schedule] Continued from the previous question, Exercise 3.18. Using
Fisher's exact test:
1. Write down the corresponding hypothesis and test statistic.
2. Calculate the probability mass function of a Hypergeometric distribution with $N = 100$, $M =
50$, $n = 50$ and $x = 22$.
3. Use R to show that the cumulative distribution function of a Hypergeometric distribution
with $N = 100$, $M = 50$, $n = 50$ and $x = 22$ equals 0.15867.
4. Perform the hypothesis test.
5. Check your answer using R.
Exercise 3.20: [wk09Q4, Solution, Schedule] Compare the results in questions 3.18 and 3.19 and
explain the differences/similarities.

A particular area in a town suffers a high burglary rate. A sample of 100 streets is taken, and in each
of the sample streets, a sample of six similar houses is taken. The table below shows the number of
sampled houses, which have had burglaries during the last six months.
No. of houses burgled x 0 1 2 3 4 5 6

No. of streets f 39 38 18 4 0 1 0
1. (a) State any assumptions needed to justify the use of a binomial model for the number of
sampled houses per street which have been burgled during the last six months.
(b) Derive the maximum likelihood estimator of p, the probability that a house of the type
sampled has been burgled during the last six months.
(c) Fit the binomial model using your estimate of p, and, without doing a formal test, com-
ment on the fit.
2. An insurance company works on the basis that the probability of a house being burgled over a
six month period is 0.18. Carry out a test to investigate whether the binomial model with this
value of p provides a good fit for the data.
Exercise 3.22: [wk09Q6, Solution, Schedule] Check your answer of Exercise 3.21 using R.
Exercise 3.23: [wk09Q7, Solution, Schedule] Does education really make a difference in how much
money you will earn? [1] Researchers randomly selected 100 people from each of three income categories
(marginally rich, comfortably rich, and super rich) and recorded their education levels. The data
are summarised in the table that follows.
Highest Education Level   Marginally Rich   Comfortably Rich   Super Rich   Total
No college                             32                 20           23      75
Some college                           13                 16            1      30
Undergraduate degree                   43                 51           60     154
Postgraduate study                     12                 13           16      41
Total                                 100                100          100     300
1. Describe the independent multinomial populations whose proportions are compared in the $\chi^2$
analysis.
2. Provide a table with the observed proportions.
3. Do the data indicate that the proportions in the various education levels differ for the three
income categories? Test at the $\alpha = 0.01$ level.
4. Construct a 95% confidence interval for the difference in proportions with at least an undergraduate
degree for individuals who are marginally and super rich. Interpret the interval.
5. Use R to check the answer in 3) and 4).
[1] Extended version of [W+] 14.25
Solutions
Solution 3.1: [wk07Q1, Exercise, Schedule] Distinction between terms:
1. The null hypothesis is the hypothesis being tested while the alternative hypothesis is the hy-
pothesis accepted if the null is rejected.
2. A one-tailed hypothesis is one where it is of the form of an inequality like $\theta > a$ or $\theta < a$, while
a two-tailed is of the form $\theta \ne a$.
3. A simple hypothesis is one where if true, it will completely specify the probability distribution,
otherwise it is called a composite hypothesis.
4. A Type I error is the mistake committed when the null hypothesis is rejected when it is in fact
true. On the other hand, a Type II error is the mistake committed when the null hypothesis is
accepted when it is in fact false.
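Solution 3.2: [wk07Q2, Exercise, Schedule] The probability mass function for a Poisson is:
$$p_X(x) = \frac{e^{-\lambda}\lambda^x}{x!}.$$
1. Thus, the best critical region is given by applying the Neyman-Pearson lemma:
$$\frac{L(x_1, \ldots, x_n; \lambda_0)}{L(x_1, \ldots, x_n; \lambda_1)} = \frac{e^{-0.1n}\,(0.1)^{\sum x_k}\big/\prod x_k!}{e^{-0.5n}\,(0.5)^{\sum x_k}\big/\prod x_k!} = e^{0.4n}\,(0.2)^{\sum x_k} \le k$$
$$(0.2)^{\sum x_k} \le k_1 \quad \left(= k/e^{0.4n}\right)$$
$$\sum x_k \log(0.2) \le k_2 \quad \left(= \log(k_1)\right)$$
$$\sum x_k \ge k^\star \quad \left(= k_2/\log(0.2)\right)$$
is the form of the best critical region (note: $\log(0.2) < 0$). Thus the best critical region is of the
form:
$$C^\star = \left\{(x_1, \ldots, x_n) : \sum x_k \ge k^\star\right\},$$
where $k^\star$ is such that $\Pr\left(\sum_{i=1}^n x_i \ge k^\star \mid H_0\right) = \alpha$.
2. For the specific form of the critical region given in the problem, that is, where we reject the null
$H_0$ when $\sum_{k=1}^{10} x_k \ge 3$, the level of significance is:
$$\alpha = \Pr\left(\sum_{k=1}^{10} x_k \ge 3 \,\Big|\, \lambda = 0.1\right) = \sum_{x=3}^\infty \frac{e^{-1}}{x!} = 1 - \left(e^{-1} + e^{-1} + e^{-1}/2\right) = 0.0803.$$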
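The significance level is a Poisson tail probability with mean $10 \times 0.1 = 1$, since the sum of the ten observations is again Poisson. A minimal sketch of the calculation follows; the helper name `poisson_tail` is illustrative and not part of the original solution.

```python
from math import exp, factorial


def poisson_tail(k, mean):
    """Pr(N >= k) for N ~ Poisson(mean), computed as 1 - Pr(N <= k - 1)."""
    return 1 - sum(exp(-mean) * mean ** x / factorial(x) for x in range(k))


# Under H0 the sum of the 10 observations is Poisson with mean 10 * 0.1.
alpha = poisson_tail(3, 10 * 0.1)
print(round(alpha, 4))
```

The same helper, evaluated at a different mean, gives the rejection probability under any other value of $\lambda$.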
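Solution 3.3: [wk07Q3, Exercise, Schedule] The density is actually that of a $N(\theta, 1)$ distribution
where the variance is known. The best critical region can be found by solving:
$$\frac{L(x_1, \ldots, x_n; 0)}{L(x_1, \ldots, x_n; 1)} = \frac{\left(1/\sqrt{2\pi}\right)^n \exp\left(-\frac{1}{2}\sum_{k=1}^n x_k^2\right)}{\left(1/\sqrt{2\pi}\right)^n \exp\left(-\frac{1}{2}\sum_{k=1}^n (x_k - 1)^2\right)} = \exp\left(-\frac{1}{2}\left(\sum_{k=1}^n 2x_k - n\right)\right) \le k.$$
Thus, a little manipulation on this will lead us to:
$$\sum_{k=1}^n x_k \ge \underbrace{-\log k + \frac{1}{2}n}_{k^\star}.$$
Therefore, the best critical region is of the form:
$$\sum_{k=1}^n x_k \ge k^\star$$
and to get the form of the constant $k^\star$, we solve for:
$$\Pr\left(\text{Reject } H_0 \mid \theta = 0\right) = \alpha,$$
the level of significance. Solving this, we get:
$$\Pr\left(\sum_{k=1}^n x_k \ge k^\star \,\Big|\, \theta = 0\right) = \Pr\left(Z \ge \frac{k^\star}{\sqrt{n}}\right)$$
since we know that when $\theta = 0$, $\sum_{k=1}^n X_k \sim N(0, n)$. Since the probability is equal to $\alpha$, we have
$\dfrac{k^\star}{\sqrt{n}} = z_{1-\alpha}$. Thus, reject the null whenever:
$$\sum_{k=1}^n x_k \ge \sqrt{n}\,(z_{1-\alpha}).$$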
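Solution 3.4: [wk07Q4, Exercise, Schedule] Testing from a Poisson distribution:
1. Using the Neyman-Pearson lemma, the best critical region can be found by solving:
$$\frac{L(x_1, \ldots, x_n; \lambda_0)}{L(x_1, \ldots, x_n; \lambda_1)} = \frac{e^{-n\lambda_0}\,\lambda_0^{\sum x_k}\big/\prod x_k!}{e^{-n\lambda_1}\,\lambda_1^{\sum x_k}\big/\prod x_k!} = e^{-n(\lambda_0 - \lambda_1)}\left(\frac{\lambda_0}{\lambda_1}\right)^{\sum x_k} \le k,$$
which after some manipulation will lead us to:
$$\sum x_k \log\left(\frac{\lambda_0}{\lambda_1}\right) \le \log\left(k e^{n(\lambda_0 - \lambda_1)}\right)$$
$$\sum x_k \ge \frac{\log\left(k e^{n(\lambda_0 - \lambda_1)}\right)}{\log\left(\frac{\lambda_0}{\lambda_1}\right)} = k^\star,$$
where the inequality is reversed in the last step because:
$$\lambda_1 > \lambda_0 \implies \frac{\lambda_0}{\lambda_1} < 1 \implies \log\left(\frac{\lambda_0}{\lambda_1}\right) < 0.$$
Thus, reject $H_0$ whenever $\sum x_k \ge k^\star$ where $k^\star$ is determined from:
$$\Pr\left(\sum_{k=1}^n x_k \ge k^\star \,\Big|\, \lambda_0\right) = \alpha.$$
2. Note that since the sum of independent Poisson random variables is another Poisson whose parameter is the sum of the
individual Poisson parameters, the distribution of the test statistic is $\sum_{k=1}^n x_k \sim \mathrm{Poisson}(n\lambda_0)$. Because this
distribution is discrete, $k^\star$ is determined as the smallest integer for which the tail probability does not exceed $\alpha$:
$$k^\star = \min\left\{k : \sum_{x=k}^\infty \frac{e^{-n\lambda_0}(n\lambda_0)^x}{x!} \le \alpha\right\},$$
which is the choice that makes the tail probability as large as possible subject to that bound.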

1. (a) Test:
$$H_0: \mu_M = \mu_A \quad\text{v.s.}\quad H_1: \mu_M \ne \mu_A,$$
where $\mu_M$ is the population mean of the number of defectives in the morning and $\mu_A$ is the population mean of the number of defectives in the afternoon. Note that we are asked to test whether there is a difference between the means, which implies a two-sided test. The test statistic uses the difference in means, given an unknown population variance, which is assumed to be equal in the two samples. Note that the sample sizes are small, thus we do not approximate the Student-t distribution with the standard normal one. Hence, the test statistic is:
$$T = \frac{(\bar X_M - \bar X_A) - (\mu_M - \mu_A)}{S_p\sqrt{\frac{1}{n_M} + \frac{1}{n_A}}} = \underbrace{\frac{(\bar X_M - \bar X_A) - (\mu_M - \mu_A)}{\sigma\sqrt{\frac{1}{n_M} + \frac{1}{n_A}}}}_{Z} \Bigg/ \underbrace{\sqrt{\frac{n^\star S_p^2}{\sigma^2}\Big/ n^\star}}_{\sqrt{\chi^2(n^\star)/n^\star}}, \quad\text{where } n^\star = n_A + n_M - 2$$
$$\overset{*}{=} \frac{\bar X_M - \bar X_A}{S_p\sqrt{\frac{1}{n_M} + \frac{1}{n_A}}} \sim t_{n_M+n_A-2}$$
* using the null hypothesis $\mu_M - \mu_A = 0$; $t_{n_M+n_A-2}$ is a Student-t distribution with $n_M + n_A - 2$ degrees of freedom.
The rejection region is $C = \{(x_1,\ldots,x_n) \mid T \in (-\infty, -t_{n_M+n_A-2,\,1-\alpha/2}) \cup (t_{n_M+n_A-2,\,1-\alpha/2}, \infty)\}$.
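From the data we have:
$$\sum_{i=1}^{n_M} x_i = 212, \qquad \sum_{i=1}^{n_M} x_i^2 = 4056, \qquad n_M = 12$$
$$\sum_{i=1}^{n_A} x_i = 161, \qquad \sum_{i=1}^{n_A} x_i^2 = 2811, \qquad n_A = 10$$
From this we can calculate:
$$\bar x_M = \frac{\sum_{i=1}^{n_M} x_i}{n_M} = 212/12 = 17\tfrac{2}{3}, \qquad s_M^2 = \frac{1}{n_M - 1}\left(\sum_{i=1}^{n_M} x_i^2 - n_M \bar x_M^2\right) = 28.242$$
$$\bar x_A = \frac{\sum_{i=1}^{n_A} x_i}{n_A} = 16.1, \qquad s_A^2 = \frac{1}{n_A - 1}\left(\sum_{i=1}^{n_A} x_i^2 - n_A \bar x_A^2\right) = 24.322$$
$$s_p^2 = \frac{s_M^2 (n_M - 1) + s_A^2 (n_A - 1)}{n_M + n_A - 2} = \frac{11 \times 28.242 + 9 \times 24.322}{20} = 26.478.$$
Thus the value of our test statistic is:
$$\frac{212/12 - 16.1}{\sqrt{26.478\left(\frac{1}{12} + \frac{1}{10}\right)}} = 0.71.$$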
Note that the significance level is not given in this exercise. Thus we have to find the p-value of the test. From Formulae and Table page 163 we observe that the $1-\alpha$ quantile of the Student-t distribution with 20 degrees of freedom takes the value 0.6870 for $\alpha = 0.25$ and 0.8600 for $\alpha = 0.20$. Therefore, the p-value is close to $2 \times 0.25 = 0.5$ (somewhat lower). At the significance levels usually considered (0.1, 0.05 or 0.01) we therefore cannot reject the null hypothesis, i.e., there is no statistically significant difference in the number of defectives in the morning compared with the afternoon.
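The pooled two-sample t calculation above can be reproduced from the summary statistics alone. The following R sketch does this under the same equal-variance assumption; the variable names are illustrative, and the exact p-value from `pt` replaces the bracketing read from the tables.

```r
# Summary statistics quoted in the solution
n_m <- 12; sum_m <- 212; sumsq_m <- 4056   # morning sample
n_a <- 10; sum_a <- 161; sumsq_a <- 2811   # afternoon sample

mean_m <- sum_m / n_m
mean_a <- sum_a / n_a
var_m <- (sumsq_m - n_m * mean_m^2) / (n_m - 1)   # sample variance, morning
var_a <- (sumsq_a - n_a * mean_a^2) / (n_a - 1)   # sample variance, afternoon

# Pooled variance and two-sided pooled t-test
df <- n_m + n_a - 2
var_pooled <- ((n_m - 1) * var_m + (n_a - 1) * var_a) / df
t_stat <- (mean_m - mean_a) / sqrt(var_pooled * (1 / n_m + 1 / n_a))
p_value <- 2 * pt(-abs(t_stat), df)

print(c(var_pooled = var_pooled, t_stat = t_stat, p_value = p_value))
```

The printed p-value should fall between 0.4 and 0.5, in line with the table-based bracketing.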
(b) See below a dotchart (note: only the stars are required). The stars represent the observations (lower, black: morning observations; upper, blue: afternoon observations). The + signs correspond to $\bar x - 2s$, $\bar x - s$, $\bar x$, $\bar x + s$, $\bar x + 2s$, in the middle using the pooled sample standard deviation and the upper and lower ones using the sample standard deviation of the individual (e.g. morning or afternoon) sample. If the data are normal, then we know that approximately 95% of the observations should lie in the interval $(\mu - 2\sigma, \mu + 2\sigma)$ and approximately 2/3 of the observations should lie in the interval $(\mu - \sigma, \mu + \sigma)$.
[Figure: dotchart of the morning and afternoon observations; horizontal axis from 5 to 30.]
From this dotchart we observe that the equal variance assumption seems reasonable and that the normality assumption seems reasonable for the morning data. For the afternoon data, however, the normality assumption seems questionable (perhaps due to the small number of observations): the density does not appear hump-shaped (i.e., we do not observe more observations around the mean) and the probability of large outliers is relatively large (i.e., we observe some excess kurtosis).
2. (a) In this question we are interested in proportions, i.e., the probability of a defective screw. We are testing:
$$H_0: p_0 = p_1 \quad\text{v.s.}\quad H_1: p_0 \ne p_1$$
where $p_0$ is the (population) probability of a defective on the days with samples of 150 screws and $p_1$ is the (population) probability of a defective on the days with samples of 100 screws. Note that we are asked to test whether there is a difference between the proportions, which implies a two-sided test, and that $n$ and $np$ are large, so we can use the normal approximation.
The test statistic of the difference in proportions is (similar to the difference in means when the variances, under the null, are equal):
$$Z = \frac{\hat p_0 - \hat p_1}{\sqrt{\hat p(1 - \hat p)\left(\frac{1}{n_0} + \frac{1}{n_1}\right)}} \sim N(0, 1)$$
Note the difference with the example in the lecture notes in week 7, where the proportion under the null hypothesis is given. In this case both $\hat p_0$ and $\hat p_1$ are random variables. Under the null hypothesis of equal proportions, the best estimate of this proportion, denoted by $\hat p$, is given by the average proportion in the two samples combined.
The rejection region is $C = \{(x_1,\ldots,x_n) \mid Z \in (-\infty, -z_{1-\alpha/2}) \cup (z_{1-\alpha/2}, \infty)\}$.
Hence, we have:
$$n_0 = 22 \times 150 = 3300, \qquad n_1 = 20 \times 100 = 2000$$
$$\hat p_0 = \frac{373}{22 \times 150} = 0.11303, \qquad \hat p_1 = \frac{232}{20 \times 100} = 0.116$$
$$\hat p = \frac{373 + 232}{22 \times 150 + 20 \times 100} = \frac{605}{5300} = 0.11415.$$
Then the value of the test statistic is given by:
$$Z = \frac{0.11303 - 0.1160}{\sqrt{0.11415 \times 0.88585 \times \left(\frac{1}{3300} + \frac{1}{2000}\right)}} = -0.330.$$
Again, no level of significance is given, so we compute the p-value. From Formulae and Tables page 160 we observe $\Phi(0.33) = 0.62930$. Hence, the p-value is $2 \times (1 - 0.62930) = 0.7414$. Thus the difference in the proportions is not significant at any level $\alpha < 0.7414$, which covers all levels used in practice.
(b) Now we have the following test:
$$H_0: p = \tilde p = 0.09 \quad\text{v.s.}\quad H_1: p = p_1 > \tilde p$$
Note that one can also set $H_0: p = \tilde p \le 0.09$, but this complicates the test statistic. It would lead to the same statistic and critical value.
The test statistic now corresponds with the one in the lecture notes:
$$Z = \frac{\hat p_1 - \tilde p}{\sqrt{\tilde p(1 - \tilde p)/n_1}} \sim N(0, 1).$$
The rejection region is $C = \{(x_1,\ldots,x_n) \mid Z \in (z_{1-\alpha}, \infty)\}$.
The value of this test statistic is, using that under the null hypothesis (see footnote 2) $p = \tilde p = 0.09$:
$$Z = \frac{0.116 - 0.09}{\sqrt{0.09 \times 0.91/2000}} = 4.063.$$
Footnote 2: In case of the composite null hypothesis $H_0: p_1 \le 0.09$, you should select here the $p_1 \in (0, 0.09]$ which leads to the highest Type I error ($\Pr(\text{Reject } H_0 \mid H_0 \text{ is true})$), which is the highest if $p_1 = 0.09$.
From Formulae and Tables page 161 we observe $\Phi(4.06) = 0.99998$, thus the corresponding p-value is $1 - \Phi(4.06) = 0.00002$. Hence, for any level of significance higher than 0.00002 (for example 5%) we reject the null hypothesis that the proportion of defectives is 9%. The data therefore provide very strong evidence that the proportion is larger than 9%.
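The one-sample test of part (b) is short enough to check numerically. The R sketch below recomputes the statistic from the counts 232 out of 2000 and uses `pnorm` for the upper-tail p-value instead of the table lookup; the variable names are illustrative.

```r
p_hat  <- 232 / 2000   # observed proportion of defectives
p_null <- 0.09         # proportion under H0
n      <- 2000

# Z statistic with the variance evaluated under the null hypothesis
z_stat <- (p_hat - p_null) / sqrt(p_null * (1 - p_null) / n)

# One-sided (upper-tail) p-value
p_value <- pnorm(z_stat, lower.tail = FALSE)
print(c(z_stat = z_stat, p_value = p_value))
```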
Solution 3.6: [wk10Q10, Exercise, Schedule] For a single observation, note that $L(Y \mid \theta) = \theta y^{\theta - 1}$. Hence
$$\frac{L(Y \mid \theta = 2)}{L(Y \mid \theta = 1)} = \frac{2y}{1} = 2y, \qquad 0 < y < 1.$$
The form of the critical region of the best test is
$$2y < k,$$
or equivalently
$$y < \frac{k}{2} = c.$$
To find $c$, note that $\alpha = 0.05$ is specified and
$$0.05 = \Pr(y < c \mid \theta = 2) = \int_0^c 2y\,dy = c^2.$$
Therefore $c = \sqrt{0.05} = 0.2236$ and the rejection region of the best test is defined by:
$$y < 0.2236,$$
i.e. reject $H_0$ when $y < 0.2236$ for a 5% level of significance.
Solution 3.7: [wk08Q1, Exercise, Schedule] The distinction between the terms: the level of significance is the probability of committing a Type I error, while the power of the test is the probability that the null is rejected when in fact it is false.
Solution 3.8: [wk08Q2, Exercise, Schedule] The power function is the probability of rejecting the null hypothesis as a function of the parameter, while the size of the test is the level of significance, i.e., the probability of committing a Type I error.
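1. Let $S = \sum_{k=1}^{10} X_k$, the total number of successes in the sample. Clearly, $S \sim \text{Binomial}(n = 10, p = \theta)$. The power function is thus
$$\pi(\theta) = \Pr\left(\text{Reject } H_0 \mid H_1 \text{ is true}\right) = \Pr\left(S \ge 6 \mid \theta \in (0.5, 1]\right)$$
$$= \Pr\left(S = 6 \mid \theta \in (0.5, 1]\right) + \Pr\left(S = 7 \mid \theta \in (0.5, 1]\right) + \Pr\left(S = 8 \mid \theta \in (0.5, 1]\right) + \Pr\left(S = 9 \mid \theta \in (0.5, 1]\right) + \Pr\left(S = 10 \mid \theta \in (0.5, 1]\right)$$
$$= \sum_{k=6}^{10} \binom{10}{k}\theta^{k}(1 - \theta)^{10-k}$$
which is clearly a function of $\theta$. A sketch of the power function for various values of the parameter is given below. It was produced using the following R code: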
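theta <- 0:100/200 + 0.5             # grid of theta values on [0.5, 1]
power <- 1 - pbinom(5, 10, theta)    # Pr(S >= 6 | theta)
par(lab = c(10, 10, 7))              # approximate number of axis tick marks
plot(theta, power, type = "n", ylab = "Power", xlab = "theta")
lines(theta, power, col = 4)         # draw the power curve in blue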
[Figure: power function of the test; vertical axis: Power (0.40 to 1.00), horizontal axis: theta (0.50 to 1.00).]
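2. The size of the test is given by:
$$\alpha = \Pr\left(\text{Type I error}\right) = \Pr\left(\text{Reject } H_0 \mid \theta \le 0.5\right) \le \Pr\left(\text{Reject } H_0 \mid \theta = 0.5\right) = \sum_{k=6}^{10} \binom{10}{k} 0.5^{k}(0.5)^{10-k} = 0.37695.$$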
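The size can be checked by evaluating the power function at the boundary value $\theta = 0.5$. A minimal R sketch, where `power_fn` is an illustrative helper name and not part of the original solution:

```r
# Power function of the test that rejects H0 when S >= 6, S ~ Binomial(10, theta)
power_fn <- function(theta) 1 - pbinom(5, size = 10, prob = theta)

# Size of the test: the power function evaluated at theta = 0.5
size <- power_fn(0.5)
print(round(size, 5))
```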
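Solution 3.9: [wk08Q3, Exercise, Schedule] The power of the test is:
$$1 - \beta = 1 - \Pr\left(\text{Type II error}\right) = 1 - \Pr\left(\text{Accept } H_0 \mid H_1 \text{ is true}\right) = \Pr\left(\text{Reject } H_0 \mid H_1 \text{ is true}\right)$$
$$\overset{*}{=} \Pr\left(\sum_{k=1}^{10} X_k \ge 3 \,\Big|\, \lambda = 0.5\right) \overset{**}{=} \sum_{x=3}^{\infty} \frac{e^{-5} 5^{x}}{x!} = 1 - \left(e^{-5} + 5e^{-5} + e^{-5} 5^2/2\right) = 0.8753.$$
* using the critical region $C^\star = \{(X_1,\ldots,X_{10}) : \sum_{k=1}^{10} x_k \ge 3\}$ and $H_1: \lambda = 0.5$ from week 7 material. ** using $\sum_{k=1}^{10} X_k \mid H_1 \sim \text{POI}(n\lambda)$, which follows from $X_i \mid H_1 \sim \text{POI}(\lambda)$.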
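Solution 3.10: [wk08Q4, Exercise, Schedule] The power of the test is:
$$1 - \beta = 1 - \Pr\left(\text{Type II error}\right) = 1 - \Pr\left(\text{Accept } H_0 \mid H_1 \text{ is true}\right) = \Pr\left(\text{Reject } H_0 \mid H_1 \text{ is true}\right)$$
$$\overset{*}{=} \Pr\left(\sum_{k=1}^{n} X_k \ge \sqrt{n}\, z_{1-\alpha} \,\Big|\, H_1 \text{ is true}\right) \overset{**}{=} \Pr\left(\sum_{k=1}^{n} X_k \ge \sqrt{n}\, z_{1-\alpha} \,\Big|\, \mu = 1\right)$$
$$\overset{***}{=} \Pr\left(Z \ge \frac{\sqrt{n}\,(z_{1-\alpha}) - n}{\sqrt{n}}\right) = \Pr\left(Z \ge z_{1-\alpha} - \sqrt{n}\right).$$
* using the critical region $C^\star = \{(X_1,\ldots,X_n) : \sum_{k=1}^{n} x_k \ge \sqrt{n}\, z_{1-\alpha}\}$ derived in week 7 material; ** using $H_1: \mu = 1$ from week 7 material; and *** using $\sum_{k=1}^{n} X_k \mid H_1 \sim N(n, n)$, which follows from $X_i \mid H_1 \sim N(1, 1)$.
This cannot be evaluated numerically unless of course the sample size is given.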
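To illustrate how the power depends on the sample size, the expression $\Pr(Z \ge z_{1-\alpha} - \sqrt{n})$ can be wrapped in a small R function. The sample sizes and the level $\alpha = 0.05$ below are hypothetical values chosen for illustration; they are not given in the exercise.

```r
# Power of the test "reject H0 when sum(x) >= sqrt(n) * z_{1 - alpha}" at mu = 1
power_normal <- function(n, alpha = 0.05) {
  1 - pnorm(qnorm(1 - alpha) - sqrt(n))
}

# Hypothetical sample sizes, for illustration only
n_values <- c(1, 4, 9, 16)
powers <- sapply(n_values, power_normal)
print(round(powers, 4))
```

The power increases with $n$, as expected, since the distance between the null and alternative means grows like $\sqrt{n}$ in standard-deviation units.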
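Solution 3.11: [wk08Q5, Exercise, Schedule] We have:
$$\sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar x)^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j} \left((x_{ij} - \bar x_j) - (\bar x - \bar x_j)\right)^2$$
$$= \sum_{j=1}^{k}\sum_{i=1}^{n_j} \left((x_{ij} - \bar x_j)^2 + (\bar x - \bar x_j)^2 - 2\,(x_{ij} - \bar x_j)(\bar x - \bar x_j)\right)$$
$$= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar x_j)^2 + \sum_{j=1}^{k}\sum_{i=1}^{n_j} (\bar x - \bar x_j)^2 - \sum_{j=1}^{k}\sum_{i=1}^{n_j} 2\,(x_{ij} - \bar x_j)(\bar x - \bar x_j)$$
$$= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar x_j)^2 + \sum_{j=1}^{k} (\bar x - \bar x_j)^2 \underbrace{\sum_{i=1}^{n_j} 1}_{= n_j} - 2\sum_{j=1}^{k} (\bar x - \bar x_j) \underbrace{\sum_{i=1}^{n_j} (x_{ij} - \bar x_j)}_{= n_j(\bar x_j - \bar x_j) = 0}$$
$$= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar x_j)^2 + \sum_{j=1}^{k} n_j (\bar x_j - \bar x)^2.$$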
Solution 3.12: [wk08Q7, Exercise, Schedule] To test for the difference in weight loss across different diet programs, we assume the one-way ANOVA model which states that $y_{ij}$, the weight loss of the $j$th individual for diet program $i = A, B, C$, satisfies:
$$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad\text{for } i = A, B, C, \text{ and } j = 1, 2, \ldots, n_i$$
where $\varepsilon_{ij}$ refers to the random error with the usual assumption of zero mean and constant variance.
One can easily verify the sample means across diet programs are:
$$\bar y_A = 5.00, \quad \bar y_B = 4.286, \quad\text{and}\quad \bar y_C = 7.667$$
which give the point estimates of the mean losses for each diet program, and the grand mean is:
$$\bar y = 5.611.$$
The total sum of squares is:
$$SST = \sum_{i=1}^{I}\sum_{j=1}^{n_i} \left(y_{ij} - \bar y\right)^2 = (3 - 5.611)^2 + (4 - 5.611)^2 + \ldots = 84.28$$
and the sum of squares between the diet programs is:
$$SSB = \sum_{i=1}^{I} n_i \left(\bar y_{i\cdot} - \bar y\right)^2 = 5\,(5 - 5.611)^2 + 7\,(4.286 - 5.611)^2 + 6\,(7.667 - 5.611)^2 = 39.52$$
so that the sum of squares within the diet programs is:
$$SSW = SST - SSB = 84.28 - 39.52 = 44.76.$$
The one-way ANOVA table is then summarized below:
ANOVA Table for the One-Way Layout

Source    d.f.   Sum of Squares   Mean Square   F-Statistic
Between   2      39.52            19.76         19.76/2.98 = 6.63
Within    15     44.76            2.98
Total     17     84.28

Thus, to test:
$$H_0: \alpha_A = \alpha_B = \alpha_C = 0 \quad\text{v.s.}\quad H_a: \text{at least one } \alpha \text{ is not zero},$$
we would reject the null hypothesis if the observed F-statistic $> F_{1-\alpha}(I - 1, N - I)$. Since
$$F = 6.63 > F_{0.95}(2, 15) = 3.68,$$
we then reject $H_0$ and say that there is strong evidence that the mean losses across diet programs are different.
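The ANOVA table can be rebuilt from the group sizes, group means and total sum of squares quoted above. The R sketch below does so; because the raw observations are not reproduced in this solution, it takes $SST = 84.28$ as given, and the results agree with the table only up to the rounding of the quoted means.

```r
# Group sizes and sample means quoted in the solution (diet programs A, B, C)
n_i    <- c(A = 5, B = 7, C = 6)
ybar_i <- c(A = 5.00, B = 4.286, C = 7.667)
sst    <- 84.28                        # total sum of squares, taken from the solution

N <- sum(n_i)
I <- length(n_i)
ybar <- sum(n_i * ybar_i) / N          # grand mean
ssb <- sum(n_i * (ybar_i - ybar)^2)    # between-group sum of squares
ssw <- sst - ssb                       # within-group sum of squares

f_stat  <- (ssb / (I - 1)) / (ssw / (N - I))
f_crit  <- qf(0.95, I - 1, N - I)      # 5% critical value
p_value <- pf(f_stat, I - 1, N - I, lower.tail = FALSE)
print(round(c(ssb = ssb, ssw = ssw, f_stat = f_stat, f_crit = f_crit, p_value = p_value), 4))
```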
Solution 3.13: [wk08Q8, Exercise, Schedule] The hypothesis is given in this question. To find the test statistic we can apply the central limit theorem, because $n = 50$ is large. Therefore, the test statistic is:
$$Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
The rejection region is $C = \{(X_1,\ldots,X_n) : Z \in (-\infty, -z_{1-\alpha/2}) \cup (z_{1-\alpha/2}, \infty)\}$.
The value of the test statistic for the sample with $n = 50$, $\bar x = 207$, and $\sigma = 42$ is given by:
$$Z = \frac{207 - 200}{42/\sqrt{50}} = 1.178511.$$
From Formulae and Tables page 160 we observe $\Phi(1.17) = 0.87900$. We have a two-sided test, therefore $p/2 = 1 - \Phi(1.17) = 0.121 \Rightarrow p = 0.242$, i.e., the p-value is 0.242.
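Solution 3.14: [wk07Q6, Exercise, Schedule] We have the hypothesis:
$$H_0: \sigma_1^2 = \sigma_2^2 \quad\text{v.s.}\quad H_1: \sigma_1^2 \ne \sigma_2^2 \quad\text{with } \alpha = 0.05$$
The test statistic is given by:
$$F = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2} \sim F(n_1 - 1, n_2 - 1)$$
$$\overset{*}{=} \frac{S_1^2}{S_2^2} \sim F(n_1 - 1, n_2 - 1),$$
* using that, under the null hypothesis, the variances are equal, so the ratio of the variances equals one.
The rejection region is $C = \{(x_1,\ldots,x_n) \mid F \in (0, 1/F_{1-\alpha/2}(n_2 - 1, n_1 - 1)) \cup (F_{1-\alpha/2}(n_1 - 1, n_2 - 1), \infty)\}$.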
The upper critical value is given by $F(24, 29, 0.975) = 2.154$ and the lower is approximated by $F(24, 29, 0.025) = 1/F(29, 24, 0.975) \approx 1/F(24, 24, 0.975) = 1/2.269 = 0.441$ (see Formulae and Tables page 173). Note that the test is two-sided, therefore we use $1 - \alpha/2$ for constructing the critical values.
The value of the test statistic is:
$$F = \frac{s_1^2}{s_2^2} = \frac{139.7}{76.6} = 1.82.$$
We reject the null hypothesis for large and small values of $F$, which is not the case. Hence, we cannot reject the null hypothesis of equal variances at a 5% significance level.
1. There does not seem to be a relationship between age and incubation period, either for the individuals who died or for those who survived. (There seems to be a (positive) relationship between surviving and the incubation period, but this was not asked in this question.)
2. For this we use the following dotplots (with the upper dots for the individuals who died (black stars) and the lower for the individuals who survived (blue stars)).
Figure 3.1: Dotplot for age [horizontal axis: Age, from 20 to 55]
(a) The dotplot does not suggest a relationship between survival and age.
Figure 3.2: Dotplot for incubation period [horizontal axis labelled Age, from 20 to 80]
(b) The dotplot suggests a relationship between survival and incubation period, namely the individuals who survived tended to have a longer incubation period.
3. We are interested in the difference in means, with unknown population standard deviation. Therefore, to set up a test statistic we have to assume that the population variance of the incubation period is equal for the individuals who survived and those who died (from the dotcharts we observe the variance of died individuals might be higher than that of the survived individuals). Under this assumption, and using the central limit theorem (which might not be a good approximation because the number of survived is 7 and the number of died is 11, i.e., the total sample size is 18) or when both the incubation period for survived and the incubation period for died are normally distributed (which might be a good approximation looking at the dotcharts), we have the following test statistic:
$$T = \frac{(\bar Y_S - \bar Y_D) - (\mu_S - \mu_D)}{S_p\sqrt{\frac{1}{n_S} + \frac{1}{n_D}}} \sim t_{n_S+n_D-2},$$
where $\bar Y_S$, $\mu_S$, $n_S$ are the sample mean, population mean, and sample size of the incubation period for survived individuals, $\bar Y_D$, $\mu_D$, $n_D$ are the sample mean, population mean, and sample size of the incubation period for died individuals, and $S_p$ is the pooled sample standard deviation. Note that the sample size is small, thus we have to use the t-distribution and not the standard normal distribution. We have:
$$n_S = 7, \qquad n_D = 11$$
$$\bar y_S = 339/7 = 48.429, \qquad \bar y_D = 305/11 = 27.727$$
$$s_S^2 = \frac{n_S}{n_S - 1}\left(\frac{\sum y_S^2}{n_S} - \left(\frac{\sum y_S}{n_S}\right)^2\right) = \frac{7}{6}\left(\frac{19665}{7} - \left(\frac{339}{7}\right)^2\right) = 541.2857$$
$$s_D^2 = \frac{n_D}{n_D - 1}\left(\frac{\sum y_D^2}{n_D} - \left(\frac{\sum y_D}{n_D}\right)^2\right) = \frac{11}{10}\left(\frac{10035}{11} - \left(\frac{305}{11}\right)^2\right) = 157.8182$$
$$s_p^2 = \frac{(n_S - 1)s_S^2 + (n_D - 1)s_D^2}{n_S + n_D - 2} = \frac{6 \times 541.2857 + 10 \times 157.8182}{16} = \frac{4825.8961}{16} = 301.6185$$
The $(1 - \alpha) \cdot 100\%$ confidence interval of the difference in means is given by:
$$(\bar x_S - \bar x_D) - t_{1-\alpha/2,\,n_S+n_D-2}\; s_p\sqrt{\frac{1}{n_S} + \frac{1}{n_D}} < \mu_S - \mu_D < (\bar x_S - \bar x_D) + t_{1-\alpha/2,\,n_S+n_D-2}\; s_p\sqrt{\frac{1}{n_S} + \frac{1}{n_D}}$$
$$20.702 - t_{1-\alpha/2,\,n_S+n_D-2} \times 17.3672\sqrt{\frac{1}{7} + \frac{1}{11}} < \mu_S - \mu_D < 20.702 + t_{1-\alpha/2,\,n_S+n_D-2} \times 17.3672\sqrt{\frac{1}{7} + \frac{1}{11}}$$
Using Formulae and Table page 163 we observe $t_{0.975,16} = 2.120$ and $t_{0.995,16} = 2.921$. Thus the 95% confidence interval for the difference in means is given by $(2.9, 38.5)$ and the 99% confidence interval for the difference in means is given by $(-3.8, 45.2)$.
The 95% confidence interval for the difference in means does not include zero, hence when testing the hypothesis of equal means versus the alternative of a difference in means (two-sided) with a significance level of 5% we would reject the null hypothesis.
However, the 99% confidence interval for the difference in means does include zero, hence when testing the hypothesis of equal means versus the alternative of a difference in means (two-sided) with a significance level of 1% we cannot reject the null hypothesis.
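Both confidence intervals can be recomputed from the summary statistics. The R sketch below uses the sample sizes, means and variances quoted in the solution and `qt` in place of the tabulated quantiles; `ci` is an illustrative helper name.

```r
n_s <- 7;  mean_s <- 339 / 7;  var_s <- 541.2857   # survived
n_d <- 11; mean_d <- 305 / 11; var_d <- 157.8182   # died

df <- n_s + n_d - 2
sd_pooled <- sqrt(((n_s - 1) * var_s + (n_d - 1) * var_d) / df)
se <- sd_pooled * sqrt(1 / n_s + 1 / n_d)   # standard error of the difference
delta <- mean_s - mean_d                    # observed difference in means

# Two-sided confidence interval for mu_S - mu_D at a given confidence level
ci <- function(level) delta + c(-1, 1) * qt(1 - (1 - level) / 2, df) * se

ci95 <- ci(0.95)
ci99 <- ci(0.99)
print(round(rbind(ci95, ci99), 1))
```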
4. (a) We perform the test:
$$H_0: \sigma_S^2 = \sigma_D^2 \quad\text{v.s.}\quad H_1: \sigma_S^2 \ne \sigma_D^2, \qquad \alpha = 5\%$$
$$F = \frac{\sigma_D^2 S_S^2}{\sigma_S^2 S_D^2} \sim F(n_1 - 1, n_2 - 1)$$
$$\overset{*}{=} \frac{S_S^2}{S_D^2} \sim F(6, 10),$$
* using that, under the null hypothesis, the variances are equal.
The rejection region is $C = \{(x_1,\ldots,x_n) \mid F \in (0, 1/F_{1-\alpha/2}(n_2 - 1, n_1 - 1)) \cup (F_{1-\alpha/2}(n_1 - 1, n_2 - 1), \infty)\}$.
The upper critical value is given by $F(6, 10, 0.975) = 4.072$ and the lower is given by $F(6, 10, 0.025) = 1/F(10, 6, 0.975) = 1/5.461 = 0.18312$ (see Formulae and Tables page 173). Note that the test is two-sided, therefore we use $1 - \alpha/2$ for constructing the critical values.
$$F = \frac{s_S^2}{s_D^2} = \frac{541.2857}{157.8182} = 3.4298.$$
We reject the null hypothesis for large and small values of $F$, which is not the case. Hence, we cannot reject the null hypothesis of equal variances at a 5% significance level.
Note that $F(6, 10, 0.95) = 3.217$ (Formulae and Tables page 172), implying that we can reject the null hypothesis of equal variances at a 10% significance level, and the p-value is slightly smaller than 0.1.
(b) See answer question c).
Although the dotcharts suggest that there is a difference in variance, when formally testing the hypothesis we cannot reject the null hypothesis of equal variances. This is due to the small sample size, which either causes the observed difference in sample variances when the population variances are equal or, in case of unequal population variances, leads to a low power of the test.
From the dotcharts we observe that the incubation period seems to be normally distributed for both the sample of survived and the sample of died individuals.
1. See below a dotchart (note: only the stars are required). The stars represent the observations (upper, black: Company A observations; lower, blue: Company B observations). The + signs correspond to $\bar x - 2s$, $\bar x - s$, $\bar x$, $\bar x + s$, $\bar x + 2s$, in the middle using the pooled sample standard deviation and the upper and lower ones using the sample standard deviation of the individual (e.g. Company A or Company B) sample. If the data are normal, then we know that approximately 95% of the observations should lie in the interval $(\mu - 2\sigma, \mu + 2\sigma)$ and approximately 2/3 of the observations should lie in the interval $(\mu - \sigma, \mu + \sigma)$.
[Figure: dotchart of the premiums of Company A and Company B; horizontal axis: Premium, from 100 to 350.]
In order to apply the hypothesis test, the sample means of company A and company B should be normally distributed with the same population variance. For the assumption of normality of the sample means of company A and company B we cannot use the CLT, because that only holds for large $n$, which is not the case. Therefore, the sample mean is normally distributed only if the underlying population is normally distributed.
From the dotcharts we observe that approximately 2/3 of the observations of both samples lie within one sample standard deviation from the sample mean and no observations are smaller/larger than the sample mean -/+ 2 times the sample standard deviation. There seems to be a concentration of the observations around the sample mean (i.e., hump-shaped p.d.f.). Therefore, we cannot reject the assumption that the premiums of company A and the premiums of company B are normally distributed.
We observe that the sample variance of company A is larger than the sample variance of company B, but this might be due to the small sample size. Hence, we cannot reject the assumption of equal variance from the dotcharts.
2. Assuming that the premiums are normally distributed, the only remaining assumption to test is that of equal variances. Hence, the hypothesis is:
$$H_0: \sigma_A^2 = \sigma_B^2 \quad\text{v.s.}\quad H_1: \sigma_A^2 \ne \sigma_B^2 \quad\text{with } \alpha = 0.05$$
$$F = \frac{\sigma_B^2 S_A^2}{\sigma_A^2 S_B^2} \sim F(n_A - 1, n_B - 1)$$
$$\overset{*}{=} \frac{S_A^2}{S_B^2} \sim F(9, 9),$$
* using that, under the null hypothesis, the variances are equal.
The rejection region is $C = \{(x_1,\ldots,x_{n_A+n_B}) \mid F \in (0, 1/F_{1-\alpha/2}(n_B - 1, n_A - 1)) \cup (F_{1-\alpha/2}(n_A - 1, n_B - 1), \infty)\}$.
The upper critical value is given by $F(9, 9, 0.975) = 4.026$ and the lower critical value by $F(9, 9, 0.025) = 1/F(9, 9, 0.975) = 1/4.026 = 0.2484$ (see Formulae and Tables page 173). Note that the test is two-sided, therefore we use $1 - \alpha/2$ for constructing the critical values.
$$F = \frac{s_A^2}{s_B^2} = \frac{4303.4}{3461.7} = 1.243,$$
where
$$s_A^2 = \frac{n_A}{n_A - 1}\left(\frac{\sum A^2}{n_A} - \left(\frac{\sum A}{n_A}\right)^2\right) = \frac{10}{9}\left(\frac{494126}{10} - \left(\frac{2134}{10}\right)^2\right) = 4303.4$$
and
$$s_B^2 = \frac{n_B}{n_B - 1}\left(\frac{\sum B^2}{n_B} - \left(\frac{\sum B}{n_B}\right)^2\right) = \frac{10}{9}\left(\frac{541463}{10} - \left(\frac{2259}{10}\right)^2\right) = 3461.7.$$
We reject the null hypothesis for large and small values of $F$, which is not the case. Hence, we cannot reject the null hypothesis of equal variances at a 5% significance level. Therefore, it is reasonable to assume that $\sigma_A^2 = \sigma_B^2$.
Note that even $F(9, 9, 0.9) = 2.440$ (Formulae and Tables page 171), so we cannot reject the null hypothesis of equal variances even at a level of significance of 20%.
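The variance-ratio test can be reproduced from the sums and sums of squares used above. The R sketch below takes critical values from `qf` instead of the tables; the variable names are illustrative.

```r
n_a <- 10; sum_a <- 2134; sumsq_a <- 494126   # Company A
n_b <- 10; sum_b <- 2259; sumsq_b <- 541463   # Company B

# Sample variances from the sums and sums of squares
var_a <- (sumsq_a - sum_a^2 / n_a) / (n_a - 1)
var_b <- (sumsq_b - sum_b^2 / n_b) / (n_b - 1)

f_stat <- var_a / var_b
crit <- qf(c(0.025, 0.975), n_a - 1, n_b - 1)   # two-sided 5% critical values
reject <- f_stat < crit[1] || f_stat > crit[2]
print(list(f_stat = f_stat, critical_values = crit, reject = reject))
```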
3. We want to test, i.e., the hypothesis is:
$$H_0: \mu_B = \mu_A \quad\text{v.s.}\quad H_1: \mu_B > \mu_A,$$
or (note: this will result in the same test statistic and the same critical value)
$$H_0: \mu_B \le \mu_A \quad\text{v.s.}\quad H_1: \mu_B > \mu_A.$$
The corresponding test statistic (note that the sample size is small, hence the Student-t distribution cannot be approximated by the standard normal distribution) is:
$$T = \frac{(\bar X_B - \bar X_A) - (\mu_B - \mu_A)}{S_p\sqrt{\frac{1}{n_B} + \frac{1}{n_A}}} \overset{*}{=} \frac{\bar X_B - \bar X_A}{S_p\sqrt{\frac{1}{n_B} + \frac{1}{n_A}}} \sim t_{n_B+n_A-2}$$
* using the null hypothesis (see footnote 3) $\mu_B - \mu_A = 0$; $t_{n_B+n_A-2}$ is a Student-t distribution with $n_B + n_A - 2 = 18$ degrees of freedom. We reject for large values of the statistic, i.e., the rejection region is $C = \{(x_1,\ldots,x_{n_A+n_B}) \mid T \in (t_{1-\alpha}(n_B + n_A - 2), \infty)\}$.
From the data and part b) we can calculate:
$$\bar x_A = \frac{\sum_{i=1}^{n_A} A_i}{n_A} = 213.4, \qquad \bar x_B = \frac{\sum_{i=1}^{n_B} B_i}{n_B} = 225.9$$
$$s_p^2 = \frac{s_A^2 (n_A - 1) + s_B^2 (n_B - 1)}{n_A + n_B - 2} = \frac{9 \times 4303.4 + 9 \times 3461.7}{18} = 3882.5.$$
Thus the value of our test statistic is:
$$\frac{225.9 - 213.4}{\sqrt{3882.5\left(\frac{1}{10} + \frac{1}{10}\right)}} = 0.4486.$$
Footnote 3: In case of the composite null hypothesis, you should select here the $\mu_B \in (-\infty, \mu_A]$ which leads to the highest Type I error ($\Pr(\text{Reject } H_0 \mid H_0 \text{ is true})$), which is the highest if $\mu_A = \mu_B$.
Note that the significance level is not given in this exercise. Thus we have to find the p-value of the test. From Formulae and Table page 163 we observe that the $1-\alpha$ quantile of the Student-t distribution with 18 degrees of freedom takes the value 0.5338 for $\alpha = 0.3$ and 0.2571 for $\alpha = 0.40$. Therefore, the p-value is between 0.3 and 0.4 (note: one-sided test). At the significance levels usually considered (0.1, 0.05 or 0.01) we therefore cannot reject the null hypothesis, i.e., there is no statistically significant evidence that company B charges a larger premium than company A.
4. Let $p_A$, $p_B$ be the (population) proportion of the premiums that are higher than 200 for Company A and B, respectively. In order to construct the confidence interval for the difference in proportions we first need the test statistic:
$$Z = \frac{(\hat p_A - \hat p_B) - (p_A - p_B)}{\sqrt{\frac{\hat p_A(1 - \hat p_A)}{n_A} + \frac{\hat p_B(1 - \hat p_B)}{n_B}}} \sim N(0, 1),$$
note that, under the null hypothesis, $n_A p_A = 5$ and $n_B p_B = 5$, which is the minimum requirement as a rule of thumb for a reasonably good approximation of a Binomial random variable by a normal random variable, which is used in the test. The corresponding (two-sided) confidence interval is given by:
$$(\hat p_A - \hat p_B) - z_{1-\alpha/2}\sqrt{\frac{\hat p_A(1 - \hat p_A)}{n_A} + \frac{\hat p_B(1 - \hat p_B)}{n_B}} < (p_A - p_B) < (\hat p_A - \hat p_B) + z_{1-\alpha/2}\sqrt{\frac{\hat p_A(1 - \hat p_A)}{n_A} + \frac{\hat p_B(1 - \hat p_B)}{n_B}}$$
From the data we have $\hat p_A = 5/10 = 0.5$ and $\hat p_B = 6/10 = 0.6$ and from Formulae and Tables page 162 $z_{0.975} = 1.96$.
$$-0.1 - 1.96\sqrt{0.25/10 + 0.24/10} < (p_A - p_B) < -0.1 + 1.96\sqrt{0.25/10 + 0.24/10}$$
$$-0.53 < (p_A - p_B) < 0.33,$$
thus the 95% confidence interval for the difference in the proportion of premiums charged higher than 200 is given by $(-0.53, 0.33)$.
This confidence interval contains the value zero, hence when testing the null hypothesis of equal proportions of premiums charged higher than 200 versus a difference in proportions (two-sided test), we cannot reject the null hypothesis at a 5% significance level.
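As a check on the arithmetic, the interval can be computed directly in R. This is a sketch using the unpooled (Wald) standard error from the formula above and `qnorm` for the normal quantile; the variable names are illustrative.

```r
p_a <- 5 / 10; n_a <- 10   # Company A: proportion of premiums above 200
p_b <- 6 / 10; n_b <- 10   # Company B: proportion of premiums above 200

# Unpooled standard error of the difference in sample proportions
se <- sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

# 95% confidence interval for p_A - p_B
ci95 <- (p_a - p_b) + c(-1, 1) * qnorm(0.975) * se
print(round(ci95, 2))
```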
5. Now, we have the following hypothesis:
$$H_0: \mu_A = 170 \quad\text{v.s.}\quad H_1: \mu_A > 170$$
The test statistic is (recall the small sample size $n_A = 10$, so we cannot approximate the Student-t distribution by a standard normal one):
$$T = \frac{\bar X_A - \mu_A}{s_A/\sqrt{n_A}} \sim t_{n_A-1}$$
$$\overset{*}{=} \frac{\bar X_A - 170}{s_A/\sqrt{n_A}} \sim t_{n_A-1}$$
* assuming that the null hypothesis is true. Reject for large values of the statistic, i.e., the rejection region is $C = \{(x_1,\ldots,x_{n_A}) \mid T \in (t_{1-\alpha}(n_A - 1), \infty)\}$.
The value of the test statistic, given the calculation in b) (i.e., $s_A^2 = 4303.4$) and c) (i.e., $\bar x_A = 213.4$), is given by:
$$T = \frac{213.4 - 170}{\sqrt{4303.4/10}} = 2.092$$
From Formulae and Tables page 163 we observe $t_{9,0.95} = 1.833$ and $t_{9,0.975} = 2.262$. Therefore, the p-value lies between 2.5% and 5%, i.e., testing at a 5% significance level would reject the null hypothesis of no increase in the premium, whereas testing at a 2.5% significance level would not lead to a rejection of the null hypothesis of no increase in the premium.
Solution 3.17: [wk09Q1, Exercise, Schedule] The calculation of the expected value in each cell is done in the table below. The expected value is simply the product of the row total and the column total, divided by the grand total. The table shows the observed counts with the expected values in parentheses; you can easily verify the numbers:

                    dull           intelligent    very capable   row total
very well clothed   81 (128.67)    322 (347.31)   233 (160.01)   636
well clothed        141 (151.94)   457 (410.11)   153 (188.95)   751
poorly clothed      127 (68.38)    163 (184.58)   48 (85.04)     338
column total        349            942            434            1,725

Thus, computing the test statistic, we have:
$$\chi^2_{\text{statistic}} = \sum \frac{(\text{Expected} - \text{Actual})^2}{\text{Expected}} = \frac{(128.67 - 81)^2}{128.67} + \ldots + \frac{(85.04 - 48)^2}{85.04} = 134.6854$$
From the chi-square table with degrees of freedom equal to (rows - 1) x (columns - 1) = 4, we have:
$$\chi^2 \text{ value} = 9.49$$
at $\alpha = 5\%$. Thus, we would reject the null hypothesis of independence if the observed $\chi^2$ statistic exceeds the $\chi^2$ value, and in this case it does. Therefore, we conclude that based on the data, there is no strong evidence to support the hypothesis that intelligence and manner of clothing are independent.
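The expected counts and the test statistic can be reproduced with R's built-in `chisq.test`, which is also used elsewhere in these solutions. The sketch below enters the observed table by hand; the dimension names are illustrative labels.

```r
# Observed counts: rows = manner of clothing, columns = intelligence
observed <- matrix(c( 81, 322, 233,
                     141, 457, 153,
                     127, 163,  48),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(
                     clothing = c("very well", "well", "poorly"),
                     ability  = c("dull", "intelligent", "very capable")))

result <- chisq.test(observed)       # Pearson chi-squared test of independence
print(round(result$expected, 2))     # expected counts under independence
print(result$statistic)              # observed chi-squared statistic
print(qchisq(0.95, df = result$parameter))   # 5% critical value
```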
1. The hypothesis is given by:
$$H_0: \text{classifications are independent} \quad\text{v.s.}\quad H_1: \text{classifications are dependent}$$
Using the likelihood ratio test we find the approximate chi-squared test statistic:
$$T = \sum_{i \in \{A,B\} \times \{I,II\}} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_1,$$
Note that the unconstrained model has two degrees of freedom, namely $\Pr(A = I)$ (which determines $\Pr(A = II) = 1 - \Pr(A = I)$, so that is no extra degree of freedom) and $\Pr(B = I)$ (which determines $\Pr(B = II) = 1 - \Pr(B = I)$, so that is no extra degree of freedom). In the constrained model, i.e., under the null, we have $\Pr(I) = \Pr(A = I) = \Pr(B = I)$ and thus $\Pr(A = II) = \Pr(B = II) = 1 - \Pr(I)$; hence the only parameter is $\Pr(I)$ and the constrained model has one degree of freedom. We will reject the null hypothesis for large values of the test statistic (interpretation: large values of the test statistic correspond to large values of $(O - E)^2$ and hence large deviations from what is expected under the null hypothesis, which is not likely if the null hypothesis is true).
In order to find the chi-squared test statistic, we have to find the observed and expected numbers. The sum of each row and the sum of each column equals 50. Therefore, under the null hypothesis that the classification criteria were independent, the expected number in each cell should be $50 \times 1/2 = 25$. The 1/2 is the probability that an observation (either in A or B) is equal to I, i.e., $50/100 = 1/2$ (using column totals); note that under the null hypothesis this is our best estimate of the proportion. Thus we have the following observed and expected numbers:

Observed   I    II          Expected   I    II
A          22   28   50     A          25   25   50
B          28   22   50     B          25   25   50
           50   50   100               50   50   100

and the corresponding observed minus expected:

Observed-Expected   I    II
A                   -3   3
B                   3    -3

Hence, the value of our test statistic is:
$$T = 4 \times \frac{3^2}{25} = 1.44.$$
From Formulae and Tables page 164 we observe that $\Pr(\chi^2_1 \le 1.44) = 0.77$. Hence, our p-value is $1 - 0.77 = 0.23$. Thus, for levels of significance of 0.23 or less (usually the case) there is no evidence of dependence of the two criteria.
2. R-code for the Pearson Chi-squared test:
> data <- matrix(c(22,28,28,22), nrow=2, byrow=TRUE) # create 2 x 2 matrix of the data
> chisq.test(data, correct=FALSE) # perform the test (without continuity correction)
This is the same as we calculated in the previous question.
1. The hypothesis is that the classifications are independent (two-sided):
$$H_0: \text{classifications are independent} \quad\text{v.s.}\quad H_1: \text{classifications are dependent}$$
Or alternatively, in terms of the following table:

N11    N12    n1.
N21    N22    n2.
n.1    n.2    n

we have the following hypothesis:
$$H_0: \frac{N_{11}}{n_{\cdot 1}} = \frac{N_{12}}{n_{\cdot 2}} \quad\text{v.s.}\quad H_1: \frac{N_{11}}{n_{\cdot 1}} \ne \frac{N_{12}}{n_{\cdot 2}}.$$
The corresponding test statistic is given by:
$$T = N_{11} \sim \text{Hypergeometric}(N, M, n)$$
We will reject the null hypothesis for small and large values of this statistic.
2. Let $X \sim \text{Hypergeometric}(N, M, n)$ with $N = 100$, $M = 50$, $n = 50$. Then we have:
$$p_X(x = 22) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}} = \frac{\binom{50}{22}\binom{50}{50 - 22}}{\binom{100}{50}} = 0.07806943$$
3. R-code for the cumulative distribution function:

> x <- 0:22
> p <- choose(50,x)*choose(50,50-x)/choose(100,50) # p is a vector with components the probability mass of the Hypergeometric distribution
> sum(p) # the cumulative distribution function Pr(X <= 22)
4. Step 1 (defining the hypothesis) and step 2 (defining the test statistic) have been done in question i). We now need to find the corresponding p-value of this test. To do so we use the cumulative distribution function. We find that $\Pr(X \le 22) = 0.15867$ (see question iii). We would reject the null hypothesis if $\Pr(X \le 22) \le \alpha/2$ or $\Pr(X \le 22) \ge 1 - \alpha/2$. The smallest $\alpha$ for which this holds is obtained by $\Pr(X \le 22) = \alpha/2 = 0.15867 \Rightarrow$ the p-value is $2 \times 0.15867 = 0.3173$. Hence, for reasonable levels of significance (less than 31%) the test cannot reject the null hypothesis of independence.
5. R-code for the Fisher test:

> data <- matrix(c(22,28,28,22), nrow=2, byrow=TRUE) # create 2 x 2 matrix of the data
> fisher.test(data) # perform the Fisher test
We observe that the p-value is 0.3173, so we cannot reject the null at a 5% significance level. Also the 95% confidence interval of the odds ratio $\frac{N_{11}}{n_{\cdot 1}} \big/ \frac{N_{12}}{n_{\cdot 2}}$ includes the value of one, hence using the confidence interval we also find no evidence that the proportions are unequal and cannot reject the null hypothesis.
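The probability mass and cumulative probability used above can also be obtained from R's built-in hypergeometric functions rather than from `choose`. In the sketch below, note that `dhyper` and `phyper` take the number of "successes" in the population (`m` = 50), the number of "failures" (`n` = 50) and the number drawn (`k` = 50), which is a different parametrisation from the $(N, M, n)$ used in the solution.

```r
# X ~ Hypergeometric with N = 100, M = 50, n = 50 in the notation of the solution
pmf_22 <- dhyper(22, m = 50, n = 50, k = 50)   # Pr(X = 22)
cdf_22 <- phyper(22, m = 50, n = 50, k = 50)   # Pr(X <= 22)

# Two-sided p-value as constructed in the solution: twice the lower tail
p_two_sided <- 2 * cdf_22
print(c(pmf_22 = pmf_22, cdf_22 = cdf_22, p_two_sided = p_two_sided))
```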
Solution 3.20: [wk09Q4, Exercise, Schedule] Neither test can reject the null hypothesis at reasonable levels of significance. However, the p-values differ substantially: the p-value is 0.3173 in the Fisher test and 0.2301 in the chi-squared test. This is because the chi-squared test uses an approximating distribution (the normal for the Binomial distribution of the observed numbers), which holds as $np \to \infty$; with $x = 22$ and $x = 28$ this should give a good approximation. Hence we cannot reject the null hypothesis in either test, but since the chi-squared test is only an approximate test, the p-values differ.
1. (a) There are two assumptions which follows from the fact a Binomial distribution is the sum
of i.i.d. Bernoulli random variables. Hence, the two assumptions are that the probability
of house being burgled is independent from other houses being burgled (the independent
part of i.i.d.) and that each house in should have the same probability of being burgled
(the identically distributed part of i.i.d.).
(b) We have that the number of houses per street which are burgled has a Bin(6, p) distribution.
Each street is an observation of the random variable X Bin(6, p) the number of houses in
a street in the sample which are burgled the past six months. Thus the Likelihood function
115
is given by:
100
Y
L(p; x) = fX (xi )
i=1
= (Pr (X = 0))39 (Pr (X = 1))38 (Pr (X = 2))18 (Pr (X = 3))4
(Pr (X = 4))0 (Pr (X = 5))1 (Pr (X = 6))0
! !39 ! !38
6 6
= 0
p (1 p) 6
1
p (1 p) 5
0 1
! !18 ! !4
6 2 4 6 3 3
p (1 p) p (1 p)
2 3
! !0 ! !1 ! !0
6 4 2 6 5 1 6 6 0
p (1 p) p (1 p) p (1 p)
4 5 6
Hence, the log-likelihood function is given by:

100
X
`(p; x) = log(L(p; x)) = log( fX (xi ))
i=1
!
6
=39 log + 39 log(p0 ) + 39 log((1 p)6 )
0
!
6
+ 38 log + 38 log(p1 ) + 38 log((1 p)5 )
1
!
6
+ 18 log + 18 log(p2 ) + 18 log((1 p)4 )
2
!
6
+ 4 log + 4 log(p3 ) + 4 log((1 p)3 )
3
!
6
+ log + log(p5 ) + log((1 p)1 )
5

=const + (0 + 38 + 36 + 12 + 5) log(p)
+ (234 + 190 + 72 + 12 + 1) log(1 p)
=const + 91 log(p) + 509 log(1 p)
* using log(ab cd ) = b log(a)

+ d log(c)
and **
using log(ab ) = b log(a) and const=
39 log 60 + 38 log 61 + 18 log 62 + 4 log 63 + log 65 . To find bp, the MLE estimate of p, we
differentiate the log-likelihood function with respect to p and equate it equal to zero:
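\[
\frac{\partial \ell(p; x)}{\partial p} = 0 \;\Rightarrow\; \frac{91}{p} - \frac{509}{1-p} = 0
\;\Rightarrow\; \frac{91}{p} = \frac{509}{1-p} \;\Rightarrow\; 91(1-p) = 509p
\;\Rightarrow\; 91 = 600p \;\Rightarrow\; \hat{p} = \frac{91}{600}.
\]
Checking whether it is indeed a maximum, i.e., the second derivative should be negative:
\[
\left.\frac{\partial^2 \ell(p; x)}{\partial p^2}\right|_{p=\hat{p}} = -\frac{91}{\hat{p}^2} - \frac{509}{(1-\hat{p})^2} < 0.
\]
Hence, $\hat{p} = 91/600$ is indeed the maximum of the log-likelihood function and hence the MLE.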
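The closed-form result can be cross-checked numerically. The following is a minimal R sketch, not part of the original solution: the helper name `loglik` and the search interval are assumptions chosen for illustration, and the street counts are those from the likelihood above.

```r
# Observed number of streets (out of 100) with x = 0, ..., 6 burgled houses
streets <- c(39, 38, 18, 4, 0, 1, 0)
x <- 0:6

# Log-likelihood of p for Bin(6, p) observations, weighted by the street counts
# (loglik is a hypothetical helper name)
loglik <- function(p) sum(streets * dbinom(x, size = 6, prob = p, log = TRUE))

# Closed-form MLE: total burgled houses divided by total number of houses
p_closed <- sum(streets * x) / (6 * sum(streets))

# Numerical maximisation over (0, 1) as a cross-check
p_numeric <- optimize(loglik, interval = c(0.001, 0.999), maximum = TRUE)$maximum

print(c(closed_form = p_closed, numeric = p_numeric))
```

Both values should agree with 91/600 up to the tolerance of the numerical optimiser.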
(c) The probabilities of the Binomial distribution are given by:
\[
\Pr(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad \text{for } x = 0, 1, 2, \ldots, n \text{ and zero otherwise.}
\]
With n = 6 and p = 91/600 we get the following probabilities for x = 0, 1, . . . , 6: 0.373,
0.400, 0.179, 0.043, 0.006, 0.000, 0.000. Hence, the expected number of streets with the
number of houses burgled equal to x = 0, 1, . . . , 6 is given by 100 (the number of streets) times
this probability. From this we can construct the following table:
# of houses burgled per street    0     1     2     3    4    5    6
Observed # of streets            39    38    18     4    0    1    0
Expected # of streets            37.3  40.0  17.9   4.3  0.6  0.0  0.0
The observed and expected numbers of streets with 0, 1, . . . , 6 burgled houses are similar,
which implies a good fit.
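The expected counts in the table can be reproduced with a few lines of R. This is an illustrative sketch only; the variable names (`p_hat`, `probs`, `expected`, `fit`) are assumptions and do not come from the original solution.

```r
# MLE from part (b)
p_hat <- 91 / 600

# Binomial(6, p_hat) probabilities for x = 0, ..., 6 burgled houses
probs <- dbinom(0:6, size = 6, prob = p_hat)

# Expected number of streets out of 100
expected <- 100 * probs

# Observed number of streets, as given in the question
observed <- c(39, 38, 18, 4, 0, 1, 0)

fit <- data.frame(burgled_houses = 0:6,
                  observed = observed,
                  expected = round(expected, 1))
print(fit)
```

Comparing the two columns row by row gives the informal goodness-of-fit check described above.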
2. We construct the following test/hypothesis:
H0: p = 0.18 provides a good fit vs. H1: p = 0.18 does not provide a good fit
For a goodness of fit test, we use the chi-squared test statistic with k bins:
\[
T = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{k-1}.
\]
Note that in this hypothesis p is given and thus not estimated, hence we do not have to reduce
the degrees of freedom by the number of parameters estimated. We reject the null hypothesis
for large values of the test statistic.
Similar to question 1(c), i.e., $X \sim \text{Bin}(n, p)$ with n = 6, but now with p = 0.18, we have the
probabilities for x = 0, 1, . . . , 6: 0.3040, 0.4004, 0.2197, 0.0643, 0.0106, 0.0009, 0.000. Hence,
the expected number of streets, given p = 0.18, with the number of houses
burgled equal to x = 0, 1, . . . , 6 is given by m = 100 times this probability. From this we can
construct the following table:
# of houses burgled per street    0      1      2      3     4     5     6
Observed # of streets            39     38     18      4     0     1     0
Expected # of streets            30.40  40.04  21.97   6.43  1.06  0.09  0.0
The expected number of streets is less than 5 for 4, 5, and 6 burgled houses per street.
Therefore, we have to aggregate cells so that every cell has an expected
number of streets larger than or equal to 5. Aggregating cells 4, 5, and 6
would only lead to an aggregate of 1.15, which is also substantially smaller than 5, therefore we
aggregate cells 3, 4, 5, and 6 (i.e., 3 or more burgled houses in a street), which results in an
aggregate of 7.58.
# of houses burgled per street    0      1      2      3+
Observed # of streets            39     38     18      5
Expected # of streets            30.40  40.04  21.97   7.58
The value of our test statistic is equal to:
\[
T = \frac{(39 - 30.40)^2}{30.40} + \frac{(38 - 40.04)^2}{40.04} + \frac{(18 - 21.97)^2}{21.97} + \frac{(5 - 7.58)^2}{7.58} = 4.13
\]
The degrees of freedom of the chi-squared test statistic are equal to 4 - 1 = 3 (# bins - 1).
The level of significance is not given in the question, so we calculate the p-value. From
Formulae and Tables page 164 we observe $\Pr(\chi^2_3 \le 4.2) = 0.7593$, hence the p-value is
$1 - \Pr(\chi^2_3 \le 4.2) = 0.2407$. We do not reject the null hypothesis that p = 0.18 provides a
good fit of the data for a level of significance smaller than 0.2407, which is usually the case.
Solution 3.22: [wk09Q6, Exercise, Schedule] The R-code is given by:

> Burgled <- c(39,38,18,4,0,1,0) # vector with the observed number of streets with 0, ..., 6 burgled houses
> BurgledPred <- 100*dbinom(0:6,6,.18) # dbinom(x,n,p) gives the probability mass function of a
# Binomial(n,p) distribution;
# dbinom(0:6,6,.18) gives a vector with the p.m.f. for x = 0, ..., 6;
# 100*dbinom(0:6,6,.18) thus gives the expected frequencies for the number of streets with 0, ..., 6
# burgled houses
# COMBINING CELLS INTO THE GROUP 3+
> Burgled2 <- c(Burgled[1:3],sum(Burgled[4:7])) # vector with the first 3 elements of Burgled
# and as fourth element the sum of the fourth to seventh elements of Burgled
> BurgledPred2 <- c(BurgledPred[1:3],sum(BurgledPred[4:7]))
# PERFORM TEST
> chisq.test(Burgled2,p=BurgledPred2,rescale.p=TRUE) # chisq.test(x (observed), p = null hypothesis
# (expected); if rescale.p is TRUE then p is rescaled (if necessary) to sum to 1; if FALSE and p does
# not sum to 1, an error is given)
For help on this function, see also:
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
We find a p-value of 0.2471. Note that the solution last week had a p-value of 0.2407; the difference is
due to rounding the value of the test statistic from 4.13 up to 4.2 in order to use the Formulae and Tables table.
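The effect of the rounding can be made explicit by computing both p-values directly. The following is a hedged sketch; the variable names (`stat_exact`, `p_exact`, `p_table`) are illustrative assumptions, and the test statistic is recomputed here from unrounded expected frequencies.

```r
# Observed streets for the cells 0, 1, 2 and 3+ burgled houses
observed <- c(39, 38, 18, 5)

# Bin(6, 0.18) probabilities, with cells 3, ..., 6 combined into "3+"
probs <- dbinom(0:6, size = 6, prob = 0.18)
probs3 <- c(probs[1:3], sum(probs[4:7]))
expected <- 100 * probs3

# Test statistic from unrounded expected frequencies
stat_exact <- sum((observed - expected)^2 / expected)

# p-value from the unrounded statistic, and from the rounded value 4.2 used in the tables
p_exact <- pchisq(stat_exact, df = 3, lower.tail = FALSE)
p_table <- pchisq(4.2, df = 3, lower.tail = FALSE)

print(c(statistic = stat_exact, p_exact = p_exact, p_table = p_table))
```

The conclusion of the test is the same under either p-value.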
1. We are looking at 3 populations here: the MR (marginally rich), the CR (comfortably rich) and
the SR (super rich). For each of these populations we group members into one of four education
groups, thus creating a multinomial classification of each of the populations.
2. The observed proportions are given by each cell divided by its column total.
Table 3.3: Observed frequencies

Education Level        Marginally Rich  Comfortably Rich  Super Rich  Total
No college             0.32             0.20              0.23        0.25
Some college           0.13             0.16              0.01        0.10
Undergraduate degree   0.43             0.51              0.60        0.51333
Postgraduate study     0.12             0.13              0.16        0.13667
Total                  1                1                 1           1
3. We will test the following hypothesis:
H0: the probabilities for MR, CR, and SR given education level are equal

vs.
H1: the probabilities for MR, CR, and SR given education level are unequal for at least one education level
Or equivalently:
H0: $p_{edu,MR} = p_{edu,CR} = p_{edu,SR}$ for edu = NC, SC, UD, PD

vs.
H1: $p_{edu,MR} \ne p_{edu,CR}$ or $p_{edu,MR} \ne p_{edu,SR}$ for at least one edu = NC, SC, UD, PD

The test statistic is:
\[
T = \sum_{i \in \{MR,CR,SR\} \times \{NC,SC,UD,PD\}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i} \sim \chi^2(f),
\]
where f = 6 is the degrees of freedom. Note that we have to use the (maximum likelihood)
estimates of the proportions to find the expected frequencies, and that the proportion
of (for example) PD can be computed from the proportions of NC, SC and UD.
Therefore, the degrees of freedom of the test are equal to (4 - 1)(3 - 1) = 6. Note, we reject
the null hypothesis for large values of the test statistic.
The critical value of our test statistic is given by $\Pr(\chi^2_6 > 16.81) = 0.01$ (see Formulae
and Tables page 169). Hence, we reject the null hypothesis if T > 16.81, i.e. $C = (16.81, \infty)$.
To calculate the test statistic we need the expected numbers under the null hypothesis:
Table 3.4: Expected numbers

Education Level        Marginally Rich  Comfortably Rich  Super Rich
No college             25               25                25
Some college           10               10                10
Undergraduate degree   51.33            51.33             51.33
Postgraduate study     13.67            13.67             13.67
Note that all expected cell values are larger than 5, so we do not have to combine cells. The value
of our test statistic is given by:
\[
\begin{aligned}
T &= \frac{(32 - 25)^2 + (20 - 25)^2 + (23 - 25)^2}{25} + \frac{(13 - 10)^2 + (16 - 10)^2 + (1 - 10)^2}{10} \\
&\quad + \frac{(43 - 51.33)^2 + (51 - 51.33)^2 + (60 - 51.33)^2}{51.33} \\
&\quad + \frac{(12 - 13.67)^2 + (13 - 13.67)^2 + (16 - 13.67)^2}{13.67} = 19.17233
\end{aligned}
\]
We observe that the value of the test statistic is in the rejection region C, so we reject the null
hypothesis that the probabilities of MR, CR, and SR given the education level are equal at a level of
significance of 1%.
4. We are asked to find the 95% confidence interval for the difference in the proportions with at least
an undergraduate degree between individuals who are marginally rich and super rich. The corresponding
numbers of observations are given in the table below.
Table 3.5: Observed frequencies

Highest Education Level   Marginally Rich   Super Rich   Total
NC, SC                    45                24           69
UD, PS                    55                76           131
Total                     100               100          200
The proportions are given by:
Table 3.6: Observed proportions

Highest Education Level   Marginally Rich          Super Rich               Total
NC, SC                    0.45                     0.24                     0.345
UD, PS                    $\hat{p}_{MR} = 0.55$    $\hat{p}_{SR} = 0.76$    0.655
Total                     1                        1                        1
Using the CLT (a proportion is a mean) the pivotal quantity is given by:
\[
T = \frac{(\hat{p}_{MR} - \hat{p}_{SR}) - (p_{MR} - p_{SR})}
{\sqrt{\frac{\hat{p}_{MR}(1 - \hat{p}_{MR})}{n_{MR}} + \frac{\hat{p}_{SR}(1 - \hat{p}_{SR})}{n_{SR}}}} \sim N(0, 1).
\]
Thus, the $100(1 - \alpha)\%$ confidence interval is given by:
\[
\begin{aligned}
(\hat{p}_{MR} - \hat{p}_{SR}) - z_{1-\alpha/2} \sqrt{\frac{\hat{p}_{MR}(1 - \hat{p}_{MR})}{n_{MR}} + \frac{\hat{p}_{SR}(1 - \hat{p}_{SR})}{n_{SR}}}
&< p_{MR} - p_{SR} \\
&< (\hat{p}_{MR} - \hat{p}_{SR}) + z_{1-\alpha/2} \sqrt{\frac{\hat{p}_{MR}(1 - \hat{p}_{MR})}{n_{MR}} + \frac{\hat{p}_{SR}(1 - \hat{p}_{SR})}{n_{SR}}}
\end{aligned}
\]
Thus the 95% confidence interval is given by: (-0.33850849, -0.08149151)

We observe that zero is not in the confidence interval. Thus, when testing the null hypothesis that
the proportions are equal against the alternative that they are unequal at a level of significance of
5%, we can reject the null hypothesis.
5. Part c)
> rich <- matrix(c(32,20,23,13,16,1,43,51,60,12,13,16),nrow=4,byrow=TRUE)
> E <- chisq.test(rich,correct=FALSE)$expected; print(E) # displays the expected cell values; use
# this to check whether all cells are at least 5
> chisq.test(rich,correct=FALSE)
Part d)
> p1.hat <- sum(rich[3:4,1])/100
> p3.hat <- sum(rich[3:4,3])/100
> diff <- p1.hat-p3.hat
> lower <- diff+qnorm(.025)*sqrt(p1.hat*(1-p1.hat)/100+p3.hat*(1-p3.hat)/100)
> upper <- diff+qnorm(.975)*sqrt(p1.hat*(1-p1.hat)/100+p3.hat*(1-p3.hat)/100)
> c(lower,upper)
Answer 95% confidence interval: (-0.33850849, -0.08149151).
Module 4
Linear Regression
4.1 Simple Linear Regression

Exercise 4.1: [wk10Q1, Solution, Schedule] Consider the exponential regression model with one
independent variable:
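\[
Y_i = \alpha_0 \beta_0^{x_i} e^{\varepsilon_i} \quad \text{for each } i = 1, 2, \ldots, n,
\]
where the $\varepsilon_i$ are independent and identically distributed normal random variables with
$E[\varepsilon_i] = 0$ and $\text{Var}(\varepsilon_i) = \sigma^2$.
1. Rewrite the exponential regression model as a linear regression model with parameters $\alpha$ and
$\beta$, and describe the relationship between $\alpha$ and $\alpha_0$ and the relationship between $\beta$ and $\beta_0$.
Derive the following from the linear regression model:
2. $\hat{\beta} = \sum_{i=1}^n c_i \log(y_i)$, where $c_i = (x_i - \bar{x})/S_{xx}$ and $S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2$.
3. $E[\hat{\beta} | X = x] = \beta$
4. $\text{Var}(\hat{\beta} | X = x) = \sigma^2 / S_{xx}$
5. $(\hat{\beta} | X = x) \sim N(\beta, \sigma^2 / S_{xx})$
6. $E[\hat{\alpha} | X = x] = \alpha$
7. $\text{Var}(\hat{\alpha} | X = x) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)$
8. $(\hat{\alpha} | X = x) \sim N\left( \alpha, \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right) \right)$
9. $\text{Cov}(\hat{\alpha}, \hat{\beta} | X = x) = -\frac{\bar{x} \sigma^2}{S_{xx}}$
10. What are the distributions of Y, $\hat{\alpha}_0$ and $\hat{\beta}_0$ conditional on (X = x), using the LSE estimates in the
linear regression model and the relationship found in question 1?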
Exercise 4.2: [wk10Q2, Solution, Schedule] Forensic scientists use various methods for determining
the likely time of death from post-mortem examination of human bodies. A recently suggested
objective method uses the concentration of a compound (3-methoxytyramine or 3-MT) in a particular
part of the brain.

In a study of the relationship between post-mortem interval and the concentration of 3-MT, samples
of the appropriate part of the brain were taken from coroners' cases for which the time of death had
been determined from eye-witness accounts. The intervals (x; in hours) and concentrations (y; in parts
per million) for 18 individuals who were found to have died from organic heart disease are given in
the following table. For the last two individuals (numbered 17 and 18 in the table) there was no
eye-witness testimony directly available, and the time of death was established on the basis of other
evidence, including knowledge of the individuals' activities.
Observation Interval Concentration

number (x) (y)
1 5.5 3.26
2 6.0 2.67
3 6.5 2.82
4 7.0 2.80
5 8.0 3.29
6 12.0 2.28
7 12.0 2.34
8 14.0 2.18
9 15.0 1.97
10 15.5 2.56
11 17.5 2.09
12 17.5 2.69
13 20.0 2.56
14 21.0 3.17
15 25.5 2.18
16 26.0 1.94
17 48.0 1.57
18 60.0 0.61
$\sum x = 337$, $\sum x^2 = 9854.5$, $\sum y = 42.98$, $\sum y^2 = 109.7936$, $\sum xy = 672.8$

In this investigation you are required to explore the relationship between concentration (regarded as the
response/dependent variable) and interval (regarded as the explanatory/independent variable).
1. Construct a scatterplot of the data. Comment on any interesting features of the data and discuss
briefly whether linear regression is appropriate to model the relationship between concentration
of 3-MT and the interval from death.
2. Calculate the correlation coefficient for the data, and use it to test the null hypothesis that the
population correlation coefficient is equal to zero.
3. Calculate the equation of the least-squares fitted regression line and use it to estimate the con-
centration of 3-MT:
(a) after 1 day and
(b) after 2 days.
Comment briefly on the reliability of these estimates.
4. Calculate a 99% confidence interval for the slope of the regression line. Using this confidence
interval, test the hypothesis that the slope of the regression line is equal to zero. Comment on
your answer in relation to the answer given in part (2) above.
Exercise 4.3: [wk10Q3, Solution, Schedule] Past Institute exam.

Consider a linear regression model in which responses $Y_i$ are uncorrelated and have expectations $\beta x_i$
and common variance $\sigma^2$ (i = 1, . . . , n), i.e. $Y_i$ is modelled as a linear regression through the origin:
\[
E[Y_i | x_i] = \beta x_i \quad \text{and} \quad V(Y_i | x_i) = \sigma^2 \quad (i = 1, \ldots, n).
\]
1. (a) Show that the least squares estimator of $\beta$ is $\hat{\beta}_1 = \sum_{i=1}^n x_i Y_i / \sum_{i=1}^n x_i^2$.
(b) Derive the expectation and variance of $\hat{\beta}_1$ under the model.
2. An alternative to the least squares estimator in this case is:
\[
\hat{\beta}_2 = \sum_{i=1}^n Y_i \Big/ \sum_{i=1}^n x_i = \bar{Y} / \bar{x}.
\]
(a) Derive the expectation and variance of $\hat{\beta}_2$ under the model.
(b) Show that the variance of the estimator $\hat{\beta}_2$ is at least as large as that of the least squares
estimator $\hat{\beta}_1$.
3. Now consider an estimator $\hat{\beta}_3$ of $\beta$ which is a linear function of the responses, i.e. an estimator
which has the form $\hat{\beta}_3 = \sum_{i=1}^n a_i Y_i$, where $a_1, \ldots, a_n$ are constants.
(a) Show that $\hat{\beta}_3$ is unbiased for $\beta$ if $\sum_{i=1}^n a_i x_i = 1$, and that the variance of $\hat{\beta}_3$ is $\sum_{i=1}^n a_i^2 \sigma^2$.
(b) Show that the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$ above may be expressed in the form $\hat{\beta}_3 = \sum_{i=1}^n a_i Y_i$ and
hence verify that $\hat{\beta}_1$ and $\hat{\beta}_2$ satisfy the condition for unbiasedness in 3.(a).
(c) It can be shown that, subject to the condition $\sum_{i=1}^n a_i x_i = 1$, the variance of $\hat{\beta}_3$ is minimised by
setting $a_i = x_i / \sum_{i=1}^n x_i^2$. Comment on this result.
Exercise 4.4: [wk10Q4, Solution, Schedule] A university wishes to analyse the performance of its
students on a particular degree course. It records the scores obtained by a sample of 12 students at
the entry to the course, and the scores obtained in their final examinations by the same students. The
results are as follows:
Student A B C D E F G H I J K L
Entrance exam score x (%) 86 53 71 60 62 79 66 84 90 55 58 72
Final paper score y (%) 75 60 74 68 70 75 78 90 85 60 62 70
$\sum x = 836$, $\sum y = 867$, $\sum x^2 = 60,016$, $\sum y^2 = 63,603$, $\sum (x - \bar{x})(y - \bar{y}) = 1,122$
1. Calculate the fitted linear regression equation of y on x.
2. Assuming the full normal model, calculate an estimate of the error variance $\sigma^2$ and obtain a
90% confidence interval for $\sigma^2$.
3. By considering the slope parameter, formally test whether the data is positively correlated.
4. Find a 95% confidence interval for the mean finals paper score corresponding to an individual
entrance score of 53.
5. Test whether these data come from a population with a correlation coefficient equal to 0.75.
6. Calculate the proportion of variance explained by the model. Hence, comment on the fit of the
model.
Exercise 4.5: [wk10Q5, Solution, Schedule] Complete the following ANOVA table for a simple
linear regression with 60 observations:
Source D.F. Sum of Squares Mean Squares F-Ratio

Regression
Error 8.2
Total 639.5
Exercise 4.6: [wk10Q6, Solution, Schedule] Suppose you are interested in relating the accounting
variable EPS (earnings per share) to the market variable STKPRICE (stock price). Then, a regression
equation was fitted using STKPRICE as the response variable with EPS as the regressor variable.
Following is the computer output from your fitted regression. You are also given that: $\bar{x} = 2.338$,
$\bar{y} = 40.21$, $s_x = 2.004$, and $s_y = 21.56$.
Regression Analysis
The regression equation is
STKPRICE = 25.044 + 7.445 EPS
Predictor Coef SE Coef T p

Constant 25.044 3.326 7.53 0.000
EPS 7.445 1.144 6.51 0.000
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 10475 10475 42.35 0.000
Error 46 11377 247
Total 47 21851
1. Calculate the correlation coefficient of EPS and STKPRICE.

2. Estimate the STKPRICE given an EPS of $2. Provide a 95% confidence interval of your esti-
mate.
3. Provide a 95% confidence interval for the slope coefficient $\beta$.
4. Compute s and $R^2$.
5. Describe how you would check if the errors have constant variance.
6. Perform a test of the significance of EPS in predicting STKPRICE at a level of significance of
5%.
7. Test the hypothesis H0 : = 24 against Ha : > 24 at a level of significance of 5%.
Exercise 4.7: [wk10Q7, Solution, Schedule] (Modified from an Institute of Actuaries exam problem)
An insurance company issues house buildings policies for houses of similar size in four different
post-code regions A, B, C, and D. An insurance agent takes independent random samples of 10 house
buildings policies for houses of similar size in each of the four regions. The annual premiums (in
dollars) were as follows:
Region A: 229 241 270 256 241 247 261 243 272 219
$\sum x = 2,479$, $\sum x^2 = 617,163$
Region B: 261 269 284 268 249 255 237 270 269 257
$\sum x = 2,619$, $\sum x^2 = 687,467$
Region C: 253 247 244 245 221 229 245 256 232 269
$\sum x = 2,441$, $\sum x^2 = 597,607$
Region D: 279 268 290 245 281 262 287 257 262 246
$\sum x = 2,677$, $\sum x^2 = 718,973$
Perform a one-way analysis of variance at the 5% level to compare the premiums for all four regions.
State briefly the assumptions required to perform this analysis of variance.
Exercise 4.8: [wk10Q8, Solution, Schedule] You are given the following one-way ANOVA model:
\[
Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad \text{for } i = 1, \ldots, I \text{ and } j = 1, \ldots, J,
\]
where the error terms $\varepsilon_{ij}$ are i.i.d. normal random variables with mean 0 and common variance $\sigma^2$.
Using fundamental principles of maximum likelihood, derive the maximum likelihood estimates for
all parameters in the model.
Exercise 4.9: [wk10Q9, Solution, Schedule] For the one-way ANOVA model derive the following
maximum likelihood estimators:
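1. $\hat{\mu} = \bar{Y} = \dfrac{\sum_{i=1}^{I} \sum_{j=1}^{n_i} Y_{ij}}{\sum_{i=1}^{I} n_i} = \dfrac{\sum_{i=1}^{I} \sum_{j=1}^{n_i} Y_{ij}}{N}$
2. $\hat{\alpha}_i = \bar{Y}_{i\cdot} - \bar{Y} = \dfrac{\sum_{j=1}^{n_i} Y_{ij}}{n_i} - \bar{Y}$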
Exercise 4.10: [wk10Q11, Solution, Schedule] Past Institute Exam (April 2005)
As part of an investigation into health service funding a working party was concerned with the issue
of whether mortality could be used to predict sickness rates. Data on standardised mortality rates and
standardised sickness rates were collected for a sample of 10 regions and are shown in the table below:
Region Mortality rate m (per 100,000) Sickness rate s (per 100,000)

1 125.2 206.8
2 119.3 213.8
3 125.3 197.2
4 111.7 200.6
5 117.3 189.1
6 100.7 183.6
7 108.8 181.2
8 102.0 168.2
9 104.7 165.2
10 121.1 228.5
Data summaries:
$\sum m = 1136.1$, $\sum m^2 = 129,853.03$, $\sum s = 1934.2$, $\sum s^2 = 377,700.62$, and $\sum ms = 221,022.58$.
1. Calculate the correlation coefficient between the mortality rates and the sickness rates and
determine the probability-value for testing whether the underlying correlation coefficient is zero
against the alternative that it is positive.
2. Noting the issue under investigation, draw an appropriate scatterplot for these data and comment
on the relationship between the two rates.
3. Determine the fitted linear regression of sickness rate on mortality rate and test whether the
underlying slope coefficient can be considered to be as large as 2.0.
4. For a region with mortality rate 115.0, estimate the expected sickness rate and calculate 95%
confidence limits for this expected rate.
Exercise 4.11: [wk10Q12, Solution, Schedule] Past Institute Exam (September 2005)
The data given in the following table are the number of deaths from AIDS in Australia for 12 consec-
utive quarters starting from the second quarter of 1983.
Quarter (i) 1 2 3 4 5 6 7 8 9 10 11 12
Number of deaths (ni ) 1 2 3 1 4 9 18 23 31 20 25 37
1. (a) Draw a scatterplot of the data.

(b) Comment on the nature of the relationship between the number of deaths and the quarter
in this early phase of the epidemic.
2. A statistician has suggested that a model of the form:
\[
E[n_i] = \gamma i^2
\]
might be appropriate for these data, where $\gamma$ is a parameter to be estimated from the data above.
She has proposed two methods for estimating $\gamma$, and these are given in parts (a) and (b) below.
(a) Show that the least squares estimate of $\gamma$, obtained by minimising $q = \sum_{i=1}^{12} (n_i - \gamma i^2)^2$, is
given by:
\[
\hat{\gamma} = \frac{\sum_{i=1}^{12} i^2 n_i}{\sum_{i=1}^{12} i^4}.
\]
(b) Show that an alternative (weighted) least squares estimate of $\gamma$, obtained by minimising
$q^* = \sum_{i=1}^{12} \frac{(n_i - \gamma i^2)^2}{i^2}$, is given by:
\[
\tilde{\gamma} = \frac{\sum_{i=1}^{12} n_i}{\sum_{i=1}^{12} i^2}.
\]
(c) Noting that $\sum_{i=1}^{12} i^4 = 60,710$ and $\sum_{i=1}^{12} i^2 = 650$, calculate $\hat{\gamma}$ and $\tilde{\gamma}$
for the data above.
3. To assess whether the single parameter model which was used in part 2 is appropriate for the
data, a two parameter model is considered. The model is of the form:
\[
E[N_i] = \gamma i^{\theta}
\]
for i = 1, . . . , 12.
(a) To estimate the parameters $\gamma$ and $\theta$, a simple linear regression model
\[
E[Y_i] = \alpha + \beta x_i
\]
is used, where $x_i = \log(i)$ and $Y_i = \log(N_i)$ for i = 1, . . . , 12. Relate the parameters $\gamma$ and
$\theta$ to the regression parameters $\alpha$ and $\beta$.
(b) The least squares estimates of $\alpha$ and $\beta$ are -0.6112 and 1.6008 with standard errors 0.4586
and 0.2525 respectively (you are not asked to verify these results).
Using the value for the estimate of $\beta$, conduct a formal statistical test to assess whether the
form of the model suggested in part 2 is adequate.
Exercise 4.12: [wk10Q13, Solution, Schedule] Past institute Exam

Consider the following data, which comprise four groups of claim sizes (y), each comprising four
observations. In scenario I, information is also given on the sum assured under the policy concerned - the
sum assured is the same for all four policies in a group. In scenario II, we regard the policies in the
different groups as having been issued by four different companies - the policies in a group are all
issued by the same company.
All monetary amounts are in units of 10, 000. Summaries of the claim sizes in each group are given
in a second table.
Group               1            2            3            4
Claim sizes y       0.11 0.46    0.52 1.43    1.48 2.05    1.52 2.36
                    0.71 1.45    1.84 2.47    2.38 3.31    2.95 4.08
I: sum assured x    1            2            3            4
II: Company         A            B            C            D
Summaries of claim sizes:
Group               1         2          3          4
$\sum y$            2.73      6.26       9.22       10.91
$\sum y^2$          2.8303    11.8018    23.0134    33.2289
1. In scenario I, suppose we adopt the linear regression model
\[
Y_i = \alpha + \beta x_i + \varepsilon_i
\]
where $Y_i$ is the ith claim size and $x_i$ is the corresponding sum assured, i = 1, . . . , 16.
(a) Calculate the total sum of squares and its partition into the regression (model) sum of
squares and the residual (error) sum of squares.
(b) Fit the model and calculate the fitted values for the first claim size of group 1 (namely
0.11) and the last claim size of group 4 (namely 4.08).
(c) Consider a test of the hypothesis H0: $\beta = 0$ against a two-sided alternative. By performing
appropriate calculations, assess the strength of the evidence against this "no linear
relationship" hypothesis.
2. In scenario II, suppose we adopt the analysis of variance model
\[
Y_{ij} = \mu + \tau_i + e_{ij}
\]
where $Y_{ij}$ is the jth claim size for company i and $\tau_i$ is the ith company effect, i = 1, 2, 3, 4 and
j = A, B, C, D.
(a) Calculate the partition of the total sum of squares into the between companies (model)
sum of squares and the within companies (residual/error) sum of squares.
(b) Fit the model.
(c) Calculate the fitted values for the first claim size of group 1 and the last claim size of group
4.
(d) Consider a test of the hypothesis H0: $\tau_i = 0$, i = A, B, C, D against a general alternative. By
performing appropriate calculations, assess the strength of the evidence against this "no
company effects" hypothesis.
4.2 Multiple Linear Regression

Exercise 4.13: [wk11Q1, Solution, Schedule] Consider the regression model
\[
Y_k = \beta x_k + \varepsilon_k, \quad \text{for each } k = 1, 2, \ldots, n,
\]
that is, the regression with one regressor variable but without the intercept term. This model
is called regression through the origin because the true regression line passes through the point (0,0).
Derive the least squares estimate of $\beta$.
Now, consider the quadratic regression model passing through the origin:
\[
Y_k = \beta x_k^2 + \varepsilon_k, \quad \text{for each } k = 1, 2, \ldots, n.
\]
Use the previous result to determine the least squares estimate of $\beta$.
Exercise 4.14: [wk11Q2, Solution, Schedule] Use the following steps to establish a relationship
between the coefficient of determination and the correlation coefficient:
1. Show that:
\[
\hat{y}_k - \bar{y} = \hat{\beta} (x_k - \bar{x}).
\]
2. Use this result to show that:
\[
SSM = \sum_{k=1}^{n} (\hat{y}_k - \bar{y})^2 = \hat{\beta}^2 s_x^2 (n - 1),
\]
where $s_x^2$ is the sample variance of X.
3. Use the previous result to establish:
\[
R^2 = \hat{\beta}^2 \frac{s_x^2}{s_y^2} = r^2,
\]
where $s_x^2$ and $s_y^2$ are the sample variances of X and Y, respectively.
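Exercise 4.15: [wk11Q3, Solution, Schedule] In the regression model $Y_k = \alpha + \beta x_k + \varepsilon_k$, use algebra
to establish the following results:
1. $R^2 = 1 - \dfrac{n - 2}{n - 1} \dfrac{s^2}{s_y^2}$, where $s_y^2$ is the sample variance of Y.
2. $s = s_y \sqrt{(1 - r^2) \dfrac{n - 1}{n - 2}}$, where $s_y$ is the sample standard deviation of Y.
3. $t(\hat{\beta}) = \dfrac{\hat{\beta}}{se(\hat{\beta})} = \sqrt{n - 2} \, \dfrac{r}{\sqrt{1 - r^2}}$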
1. Write down the design matrix for the simple linear regression model.
2. Write out the matrix $X^\top X$ for the simple linear regression model.
3. Write out the matrix $X^\top Y$ for the simple linear regression model.
4. Write out the matrix $(X^\top X)^{-1}$ for the simple linear regression model.
5. Calculate $\hat{\beta} = (X^\top X)^{-1} X^\top Y$ using your results above.
Exercise 4.17: [wk11Q5, Solution, Schedule] The following model was fitted to a sample of super-
markets in order to explain their profit levels:
\[
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon
\]
where
y = profits, in thousands of dollars
x1 = food sales, in tens of thousands of dollars
x2 = nonfood sales, in tens of thousands of dollars, and
x3 = store size, in thousands of square feet.
The estimated regression coefficients are given below:
$\hat{\beta}_1 = 0.027$ and $\hat{\beta}_2 = 0.097$ and $\hat{\beta}_3 = 0.525$.
Which of the following is TRUE?
(A) A dollar increase in food sales increases profits by 2.7 cents.
(B) A 2.7 cent increase in food sales increases profits by a dollar.
(C) A 9.7 cent increase in nonfood sales decreases profits by a dollar.
(D) A dollar decrease in nonfood sales increases profits by 9.7 cents.
(E) An increase in store size by one square foot increases profits by 52.5 cents.
Exercise 4.18: [wk11Q6, Solution, Schedule] In a regression model of three explanatory variables,
twenty-five observations were used to calculate the least squares estimates. The total sum of squares
and regression sum of squares were found to be 666.98 and 610.48, respectively. Calculate the ad-
justed coefficient of determination.
(A) 89.0%
(B) 89.4%
(C) 89.9%
(D) 90.3%
(E) 90.5%
Exercise 4.19: [wk11Q7, Solution, Schedule] In a multiple regression model given by:
\[
y = \beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1} + \varepsilon,
\]
which of the following gives a correct expression for the coefficient of determination?
I. $\dfrac{SSM}{SST}$
II. $\dfrac{SST - SSE}{SST}$
III. $\dfrac{SSM}{SSE}$
(A) I only
(B) II only
(C) III only
(D) I and II only
(E) I and III only
Exercise 4.20: [wk11Q8, Solution, Schedule] The ANOVA table output from a multiple regression
model is given below:
ANOVA Table
Source D.F. SS MS F-Ratio Prob(> F)
Regression 5 13326.1 2665.2 13.13 0.000
Error 42 8525.3 203.0
Total 47 21851.4
Compute the adjusted coefficient of determination.
(A) 52%
(B) 56%
(C) 61%
(D) 63%
(E) 68%
Exercise 4.21: [wk11Q9, Solution, Schedule] You have information on 62 purchases of Ford automobiles.
In particular, you have the amount paid for the car (y) in hundreds of dollars, the annual
income of the individuals ($x_1$) in hundreds of dollars, the sex of the purchaser ($x_2$, 1 = male and 0 = female),
and whether or not the purchaser graduated from college ($x_3$, 1 = yes and 0 = no). After examining
the data and other information available, you decide to use the regression model:
\[
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon.
\]
You are given that:
\[
(X^\top X)^{-1} =
\begin{pmatrix}
0.109564 & 0.000115 & 0.035300 & 0.026804 \\
0.000115 & 0.000001 & 0.000115 & 0.000091 \\
0.035300 & 0.000115 & 0.102446 & 0.023971 \\
0.026804 & 0.000091 & 0.023971 & 0.083184
\end{pmatrix}
\]
and the mean square error for the model is $s^2 = 30106$. Calculate $se(\hat{\beta}_2)$.
(A) 0.17
(B) 17.78
(C) 50.04
(D) 55.54
(E) 57.43
Exercise 4.22: [wk11Q10, Solution, Schedule] Suppose in addition to the information in question
9., you are given:
\[
X^\top Y =
\begin{pmatrix}
9\,558 \\
4\,880\,937 \\
7\,396 \\
6\,552
\end{pmatrix}.
\]
6 552
Calculate the expected difference in the amount spent to purchase a car between a person who gradu-
ated from college and another one who did not.
(A) 233.5
(B) 1 604.3
(C) 2 195.3
(D) 4 920.6
(E) 6 472.1
Exercise 4.23: [wk11Q11, Solution, Schedule] A regression model of y on four independent vari-
ables x1 , x2 , x3 and x4 has been fitted to a data consisting of 212 observations and the computer output
from estimating this model is given below:
Regression Analysis
The regression equation is
y = 3894 - 50.3 x1 + 0.0826 x2 + 0.893 x3 + 0.137 x4
Predictor Coef SE Coef T

Constant 3893.8 409.0 9.52
x1 -50.32 9.062 -5.55
x2 0.08258 0.02133 3.87
x3 0.89269 0.04744 18.82
x4 0.13677 0.05303 2.58
Which of the following statements is NOT true?
(A) All the explanatory variables have a positive influence on y.
(B) The variable x1 is a significant variable.
(C) The variable x2 is a significant variable.
(D) The variable x3 is a significant variable.
(E) The variable x4 is a significant variable.
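Exercise 4.24: [wk11Q12, Solution, Schedule] In a multiple regression model, which of the following
gives a correct expression for the unbiased estimate of $\sigma^2$?
(A) $\dfrac{1}{n - p + 1} (Y - X\hat{\beta})^\top (Y - X\hat{\beta})$
(B) $\dfrac{1}{n - p + 1} (Y - \hat{Y})^\top (Y - \hat{Y})$
(C) $\dfrac{1}{n - 1} Y^\top Y$
(D) $\dfrac{1}{n - 1} (Y - \hat{Y})^\top (Y - \hat{Y})$
(E) $\dfrac{1}{n - p} (Y - X\hat{\beta})^\top (Y - X\hat{\beta})$
Note: p is the rank of X.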
Exercise 4.25: [wk11Q13, Solution, Schedule] The estimated regression model of fitting life
expectancy from birth (LIFE EXP) on the country's gross national product (in thousands) per population
(GNP) and the percentage of population living in urban areas (URBAN%) is given by:
LIFE EXP = 48.24 + 0.79 GNP + 0.154 URBAN%.
For a particular country, its URBAN% is 60 and its GNP is 3.0. Calculate the estimated life ex-
pectancy at birth for this country.
(A) 49
(B) 50
(C) 57
(D) 60
(E) 65
Exercise 4.26: [wk11Q14, Solution, Schedule] What is the use of the scatter plot of the fitted values
and the residuals?
(A) to examine the normal distribution assumption of the errors

(B) to examine the goodness of fit of the regression model
(C) to examine the constant variation assumption of the errors
(D) to test whether the errors have zero mean
(E) to examine the independence of the errors
Exercise 4.27: [wk11Q15, Solution, Schedule] For the case of the multiple regression model, show:
1. $E[\hat{\beta} | X = x] = \beta$
2. $\text{Var}(\hat{\beta} | X = x) = \sigma^2 (X^\top X)^{-1}$
1. Suggest why $H = X (X^\top X)^{-1} X^\top$ is called the "hat" matrix.
2. Show that $H H^\top = H^2 = H$.
3. Explain why:
\[
\hat{y}_i = h_{ii} y_i + \sum_{j \ne i} h_{ij} y_j,
\]
where $h_{ij}$ is the (i, j)th element of H.
4. Show that the (i, j)th element of H is given by:
\[
\frac{1}{n} + \frac{(x_i - \bar{x})(x_j - \bar{x})}{S_{xx}}
\]
for the special case of the simple linear regression model.
5. Using H, write down an expression for the vector of residuals $\hat{e} = Y - \hat{Y}$. (We're back in the
multiple linear regression setting.)
6. Using H, calculate $E[\hat{e} | X = x]$.
7. Using H, calculate $\text{Var}(\hat{e} | X = x)$.
8. Explain why the ith standardised residual in a multiple regression model is given by $\hat{e}_i / (s \sqrt{1 - h_{ii}})$,
where
\[
s = \sqrt{\frac{1}{n - p} \sum_{j=1}^{n} \hat{e}_j^2}.
\]
Solutions
1. Consider the given exponential regression model. First, transform the regression equation so
that you have a linear regression form by taking the logarithms of both sides:
\[
\log(y_i) = \log(\alpha_0) + x_i \log(\beta_0) + \varepsilon_i = \alpha + \beta x_i + \varepsilon_i,
\]
where $\alpha = \log(\alpha_0)$ and $\beta = \log(\beta_0)$.
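2. Now, consider the sum of squares:
\[
SS(\alpha, \beta) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( \log(y_i) - \alpha - \beta x_i \right)^2
\]
and differentiate with respect to the parameters and set to zero, that is,
\[
\begin{aligned}
\frac{\partial}{\partial \alpha} SS(\alpha, \beta) &= \sum_{i=1}^{n} (-2) \left( \log(y_i) - \alpha - \beta x_i \right) = 0 \\
\frac{\partial}{\partial \beta} SS(\alpha, \beta) &= \sum_{i=1}^{n} (-2 x_i) \left( \log(y_i) - \alpha - \beta x_i \right) = 0.
\end{aligned}
\]
Rearranging and simplifying leads us to the following normal equations:
\[
\begin{aligned}
n \hat{\alpha} + \hat{\beta} \sum_{i=1}^{n} x_i &= \sum_{i=1}^{n} \log(y_i) \\
\hat{\alpha} \sum_{i=1}^{n} x_i + \hat{\beta} \sum_{i=1}^{n} x_i^2 &= \sum_{i=1}^{n} x_i \log(y_i).
\end{aligned}
\]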
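Solving, we get:
\[
\begin{aligned}
\hat{\alpha} &= \sum_{i=1}^{n} \log(y_i)/n - \hat{\beta} \sum_{i=1}^{n} x_i/n = \overline{\log(y)} - \hat{\beta} \bar{x} \\
\hat{\beta} &= \frac{\sum_{i=1}^{n} x_i \log(y_i) - \hat{\alpha} \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2} \\
&= \frac{\sum_{i=1}^{n} x_i \log(y_i) - \overline{\log(y)} \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}
   + \hat{\beta} \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n \sum_{i=1}^{n} x_i^2}
\end{aligned}
\]
Collecting the terms in $\hat{\beta}$ on the left-hand side gives:
\[
\begin{aligned}
\hat{\beta} \, \frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2}{\sum_{i=1}^{n} x_i^2}
&= \frac{\sum_{i=1}^{n} x_i \log(y_i) - \overline{\log(y)} \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2} \\
\hat{\beta} &= \frac{\sum_{i=1}^{n} x_i \log(y_i) - \overline{\log(y)} \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n}} \\
&= \frac{\sum_{i=1}^{n} x_i \log(y_i) - \sum_{i=1}^{n} \sum_{j=1}^{n} \log(y_j) x_i / n}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \\
&= \frac{\sum_{i=1}^{n} x_i \log(y_i) - \overbrace{\textstyle\sum_{i=1}^{n} x_i / n}^{= \bar{x}} \; \sum_{j=1}^{n} \log(y_j)}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \\
&= \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \log(y_i)}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \\
&= \sum_{i=1}^{n} c_i \log(y_i)
\end{aligned}
\]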
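The closed-form estimates can be compared against R's built-in least squares fit. The following is a minimal sketch on simulated data: the design points `x`, the parameter values `alpha0` and `beta0`, the noise level and the seed are all hypothetical choices made only for illustration.

```r
set.seed(1)

# Hypothetical design points and parameters of the exponential model
x <- c(1, 2, 3, 5, 8, 13)
alpha0 <- 2
beta0 <- 1.3

# Simulate Y_i = alpha0 * beta0^x_i * exp(eps_i) with normal errors
y <- alpha0 * beta0^x * exp(rnorm(length(x), sd = 0.1))

# Closed-form estimates from the derivation: beta_hat = sum(c_i * log(y_i))
Sxx <- sum((x - mean(x))^2)
c_i <- (x - mean(x)) / Sxx
beta_hat <- sum(c_i * log(y))
alpha_hat <- mean(log(y)) - beta_hat * mean(x)

# Reference fit of the linearised model log(y) = alpha + beta * x + eps
fit <- lm(log(y) ~ x)

print(rbind(formula = c(alpha_hat, beta_hat), lm = unname(coef(fit))))
```

The two rows should coincide up to floating-point error, since `lm` solves the same normal equations.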
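3.
\[
\begin{aligned}
E\left[ \hat{\beta} | X = x \right] &= E\left[ \sum_{i=1}^{n} \frac{x_i - \bar{x}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \log(y_i) \,\Big|\, X = x \right] \\
&= \sum_{i=1}^{n} \frac{x_i - \bar{x}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} E\left[ \log(y_i) | X = x \right] \\
&= \sum_{i=1}^{n} \frac{x_i - \bar{x}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} (\alpha + \beta x_i) \\
&= \alpha \underbrace{\sum_{i=1}^{n} \frac{x_i - \bar{x}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}}_{= 0}
 + \beta \underbrace{\frac{\sum_{i=1}^{n} x_i (x_i - \bar{x})}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}}_{= 1} \\
&= \beta
\end{aligned}
\]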
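4.
\begin{align*}
\mathrm{Var}\left(\hat\beta \mid X = x\right)
 &= \mathrm{Var}\left(\sum_{i=1}^{n} c_i \log(Y_i) \,\Big|\, X = x\right) \\
 &= \sum_{i=1}^{n} c_i^2\, \mathrm{Var}\left(\log(Y_i) \mid X = x\right) \\
 &= \sigma^2 \sum_{i=1}^{n} c_i^2 \\
 &= \sigma^2\, \frac{\sum_{i=1}^{n} (x_i - \bar x)^2}{\left(\sum_{i=1}^{n} x_i^2 - n\bar x^2\right)^2} \\
 &= \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2 - n\bar x^2}.
\end{align*}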
5. We have that $(\varepsilon_i \mid X = x) \sim N(0, \sigma^2)$ and $\log(Y_i) = \alpha + \beta x_i + \varepsilon_i$ for $i = 1, \ldots, n$. From that it follows that $(\log(Y_i) \mid X = x) \sim N(\alpha + \beta x_i, \sigma^2)$, because $\log(Y_i)$ is a linear function of $\varepsilon_i$ with constants $\alpha$ and $\beta$. Note that these are population parameters, which are constant (but unknown) and not random variables themselves. Because $(\hat\beta \mid X = x)$ is a linear combination of the $\log(y_i)$ (we have $\hat\beta = \sum_{i=1}^{n} \frac{x_i - \bar x}{\sum_{j=1}^{n} x_j^2 - n\bar x^2} \log(y_i)$), it must hold that $(\hat\beta \mid X = x)$ also has a normal distribution, with mean and variance as given in (3) and (4).
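6.
\begin{align*}
E\left[\hat\alpha \mid X = x\right]
 &= E\left[\overline{\log(y)} - \hat\beta\,\bar x \mid X = x\right] \\
 &= E\left[\sum_{i=1}^{n} \log(y_i)/n \,\Big|\, X = x\right] - E\left[\hat\beta \mid X = x\right] \bar x \\
 &= \frac{1}{n} \sum_{i=1}^{n} E\left[\log(Y_i) \mid X = x\right] - \beta \bar x \\
 &= \frac{1}{n} \sum_{i=1}^{n} (\alpha + \beta x_i) - \beta \bar x \\
 &= \frac{1}{n} \left(n\alpha + \beta \sum_{i=1}^{n} x_i\right) - \beta \bar x \\
 &= \alpha + \beta \sum_{i=1}^{n} \frac{x_i}{n} - \beta \bar x = \alpha.
\end{align*}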
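7.
\begin{align*}
\mathrm{Var}\left(\hat\alpha \mid X = x\right)
 &= \mathrm{Var}\left(\overline{\log(y)} - \hat\beta\,\bar x \mid X = x\right) \\
 &= \mathrm{Var}\left(\overline{\log(y)} \mid X = x\right) + \bar x^2\, \mathrm{Var}\left(\hat\beta \mid X = x\right)
  - 2\bar x\, \mathrm{Cov}\left(\overline{\log(y)}, \hat\beta \mid X = x\right) \\
 &= \mathrm{Var}\left(\sum_{i=1}^{n} \frac{\log(y_i)}{n} \,\Big|\, X = x\right) + \bar x^2\, \mathrm{Var}\left(\hat\beta \mid X = x\right)
  - 2\bar x\, \mathrm{Cov}\left(\sum_{i=1}^{n} \frac{\log(y_i)}{n}, \sum_{i=1}^{n} c_i \log(y_i) \,\Big|\, X = x\right) \\
 &= \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}\left(\log(y_i) \mid X = x\right) + \bar x^2\, \mathrm{Var}\left(\hat\beta \mid X = x\right)
  - \frac{2\bar x}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} c_i\, \mathrm{Cov}\left(\log(y_i), \log(y_j) \mid X = x\right) \\
 &\overset{*}{=} \frac{1}{n} \mathrm{Var}\left(\log(y_i) \mid X = x\right) + \bar x^2\, \mathrm{Var}\left(\hat\beta \mid X = x\right)
  - \frac{2\bar x}{n} \sum_{i=1}^{n} c_i\, \mathrm{Var}\left(\log(y_i) \mid X = x\right) \\
 &= \frac{\sigma^2}{n} + \frac{\bar x^2 \sigma^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}
  - \frac{2\sigma^2 \bar x}{n} \underbrace{\sum_{i=1}^{n} c_i}_{=0 \text{ (since } \sum_{i=1}^{n} (x_i - \bar x) = 0)} \\
 &= \sigma^2 \left(\frac{1}{n} + \frac{\bar x^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}\right)
\end{align*}
* using that $\mathrm{Cov}\left(\log(y_i), \log(y_j) \mid X = x\right)$ is equal to zero if $i \neq j$ and equal to $\mathrm{Var}\left(\log(y_i) \mid X = x\right)$ if $i = j$.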
8. We have that $\hat\alpha$ is a linear combination of two normally distributed random variables, i.e., $(\overline{\log(Y)} \mid X = x)$ and $(\hat\beta \mid X = x)$, which is thus also normally distributed. The mean and variance are given in question (6) and (7).
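9.
\begin{align*}
\mathrm{Cov}\left(\hat\alpha, \hat\beta \mid X = x\right)
 &= \mathrm{Cov}\left(\overline{\log(y)} - \hat\beta\,\bar x,\ \hat\beta \mid X = x\right) \\
 &= \underbrace{\mathrm{Cov}\left(\overline{\log(y)}, \hat\beta \mid X = x\right)}_{=0 \text{ (see (7))}}
  - \bar x\, \mathrm{Cov}\left(\hat\beta, \hat\beta \mid X = x\right) \\
 &= -\frac{\bar x\, \sigma^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}.
\end{align*}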
10. We have $\alpha_0 = \exp(\alpha)$ and $\beta_0 = \exp(\beta)$. Moreover, we have that $(\hat\alpha \mid X = x)$, $(\hat\beta \mid X = x)$ and $\log(Y)$ are normally distributed with their mean and variance as given in (5), (6), and (7). Thus, $(\hat\alpha_0 \mid X = x)$, $(\hat\beta_0 \mid X = x)$ and $(Y \mid X = x)$ are lognormally distributed with parameters the mean of the logarithm of the variable and the variance of the logarithm of the variable. For example, $\mu_{\hat\alpha_0}$ is $E[\hat\alpha \mid X = x]$ and $\sigma^2_{\hat\alpha_0}$ is $\mathrm{Var}(\hat\alpha \mid X = x)$.
[Figure: scatter plot of Concentration against Interval (x).]
1. Interesting features are that, in general, the concentration of 3-MT in the brain seems to decrease as the post mortem interval increases. Another interesting feature is that two observations have a much higher post mortem interval than the other observations.
The data seems to be appropriate for linear regression. The linear relationship seems to hold, especially for values of interval between 5 and 26 (we have enough observations for that). Care should be taken when evaluating y for x lower than 5 and larger than 26 (only two observations), because we do not know whether the linear relationship between x and y still holds there.
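2. We test:
\[ H_0: \rho = 0 \quad \text{v.s.} \quad H_1: \rho \neq 0 \]
The corresponding test statistic is given by:
\[ T = \frac{R\sqrt{n-2}}{\sqrt{1 - R^2}} \sim t_{n-2}. \]
We reject the null hypothesis for large and small values of the test statistic.
We have n = 18 and the correlation coefficient is given by:
\begin{align*}
r &= \frac{\sum x_i y_i - n\bar x \bar y}{\sqrt{\left(\sum x_i^2 - n\bar x^2\right)\left(\sum y_i^2 - n\bar y^2\right)}} \\
 &= \frac{672.8 - 18 \cdot (337/18) \cdot (42.98/18)}{\sqrt{\left(9854.5 - 337^2/18\right)\left(109.7936 - 42.98^2/18\right)}} = -0.827
\end{align*}
Thus, the value of our test statistic is given by:
\[ T = \frac{-0.827\sqrt{16}}{\sqrt{1 - (-0.827)^2}} = -5.89. \]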

From Formulae and Tables page 163 we observe $\Pr(t_{16} \le -4.015) \overset{*}{=} \Pr(t_{16} \ge 4.015) = 0.05\%$, * using the symmetry property of the Student-t distribution. The value of our test statistic (-5.89) is smaller than -4.015, thus our p-value should be smaller than $2 \cdot 0.05\% = 0.1\%$. Thus, we can reject the null hypothesis even at a significance level of 0.1%, hence we can conclude that there is a linear dependency between interval and concentration. Note that the alternative hypothesis here is "a linear dependency" and not "a negative linear dependency", so by rejecting the null hypothesis you accept the alternative of a linear dependency. Although a test with negative dependency as the alternative hypothesis would also lead to accepting that alternative, the construction of this test means we have to use the phrase "a linear dependency" and not "a negative linear dependency".
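The correlation test above uses only summary statistics, so it is easy to reproduce. The following is a small sketch using the sums quoted in the solution; the variable names are illustrative assumptions.

```python
import math

# Summary statistics quoted in the solution (n = 18 observations)
n = 18
sum_x, sum_y = 337.0, 42.98
sum_xx, sum_yy, sum_xy = 9854.5, 109.7936, 672.8

# Corrected sums of squares and cross-products
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n

# Sample correlation and the t statistic with n - 2 degrees of freedom
r = s_xy / math.sqrt(s_xx * s_yy)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
```

Both values are negative, in line with the decreasing pattern in the scatter plot; the p-value itself is read from the t table rather than computed here.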
3. The linear regression model is given by:
\[ y = \alpha + \beta x + \varepsilon \]
The (BLUE) estimate of the slope is given by:
\begin{align*}
\hat\beta &= \frac{\sum x_i y_i - n \left(\sum x_i/n\right)\left(\sum y_i/n\right)}{\sum x_i^2 - n\left(\sum x_i/n\right)^2} \\
 &= \frac{672.8 - 337 \cdot 42.98/18}{9854.5 - 337^2/18} = -0.0372008
\end{align*}
The (BLUE) estimate of the intercept is given by:
\begin{align*}
\hat\alpha &= \bar y - \hat\beta\,\bar x \\
 &= 42.98/18 + 0.0372008 \cdot 337/18 = 3.084259
\end{align*}
Thus, the estimate of y given a value of x is given by:
\begin{align*}
\hat y &= \hat\alpha + \hat\beta x \\
 &= 3.084259 - 0.0372008\,x
\end{align*}
(a) One day equals 24 hours, i.e., x = 24, thus
\[ \hat y = \hat\alpha + \hat\beta \cdot 24 = 3.084259 - 0.0372008 \cdot 24 = 2.19 \]
(b) Two days equal 48 hours, i.e., x = 48, thus
\[ \hat y = \hat\alpha + \hat\beta \cdot 48 = 3.084259 - 0.0372008 \cdot 48 = 1.30 \]
The data set contains accurate data up to 26 hours, as for observations 17 and 18 (at 48 hours and 60 hours respectively) there was no eye-witness testimony directly available. Predicting 3-MT concentration after 26 hours may not be advisable, even though x = 48 is within the range of the x-values (5.5 hours to 60 hours).
4. The pivotal quantity is given by:
\[ \frac{\hat\beta - \beta}{\mathrm{s.e.}(\hat\beta)} \sim t_{n-2}. \]
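We have:
\[ \mathrm{s.e.}(\hat\beta) = \sqrt{\frac{\hat\sigma^2}{\sum x_i^2 - n\bar x^2}} = \sqrt{\frac{\hat\sigma^2}{9854.5 - 337^2/18}} \]
\begin{align*}
\hat\sigma^2 &= \frac{1}{n-2}\left(\sum y_i^2 - \left(\sum y_i\right)^2/n - \frac{\left(\sum x_i y_i - \sum x_i \sum y_i/n\right)^2}{\sum x_i^2 - \left(\sum x_i\right)^2/n}\right) \\
 &= \frac{1}{16}\left(109.7936 - 42.98^2/18 - \frac{\left(672.8 - 337 \cdot 42.98/18\right)^2}{9854.5 - 337^2/18}\right) = 0.1413014
\end{align*}
\[ \mathrm{s.e.}(\hat\beta) = \sqrt{\frac{0.1413014}{9854.5 - 337^2/18}} = 0.00631331 \]
From Formulae and Tables page 163 we have $t_{16,1-0.005} = 2.921$.

Using the pivotal quantity, the 99% confidence interval of the slope is given by:
\begin{align*}
\hat\beta - t_{16,1-\alpha/2}\, \mathrm{s.e.}(\hat\beta) &< \beta < \hat\beta + t_{16,1-\alpha/2}\, \mathrm{s.e.}(\hat\beta) \\
-0.0372008 - 2.921 \cdot 0.00631331 &< \beta < -0.0372008 + 2.921 \cdot 0.00631331 \\
-0.055641979 &< \beta < -0.0188
\end{align*}
Thus the 99% confidence interval of $\beta$ is given by: (-0.0556, -0.0188).
Note that $\beta = 0$ is not within the 99% confidence interval, therefore we would reject the null hypothesis that $\beta$ equals zero and accept the alternative that $\beta \neq 0$ at a 1% level of significance. This confirms the result in (2) where the correlation coefficient was shown to not equal zero at the 1% significance level.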
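The slope estimate, its standard error and the 99% confidence interval can be reproduced from the same summary statistics. This is a sketch only: the critical value 2.921 is hard-coded from the table quoted in the solution rather than computed, and the variable names are assumptions.

```python
import math

# Summary statistics quoted in the solution (n = 18 observations)
n = 18
sum_x, sum_y = 337.0, 42.98
sum_xx, sum_yy, sum_xy = 9854.5, 109.7936, 672.8

s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n

# Least squares estimates of slope and intercept
beta_hat = s_xy / s_xx
alpha_hat = sum_y / n - beta_hat * sum_x / n

# Unbiased estimate of the error variance and the slope's standard error
sigma2_hat = (s_yy - s_xy ** 2 / s_xx) / (n - 2)
se_beta = math.sqrt(sigma2_hat / s_xx)

# t_{16, 0.995} as read from the tables in the solution (not computed here)
t_crit = 2.921
ci_99 = (beta_hat - t_crit * se_beta, beta_hat + t_crit * se_beta)
```

Both ends of `ci_99` are negative, which is the numerical form of the statement that zero lies outside the interval.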
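1. (a) The least squares estimator of $\beta$ minimizes:
\[ S(\beta) = \sum_{i=1}^{n} (y_i - \beta x_i)^2 = \sum_{i=1}^{n} y_i^2 + \beta^2 \sum_{i=1}^{n} x_i^2 - 2\beta \sum_{i=1}^{n} y_i x_i \]
Differentiating $S(\beta)$ with respect to $\beta$ and setting it equal to zero gives:
\[ 0 = \frac{\partial S(\beta)}{\partial \beta} = 2\beta \sum_{i=1}^{n} x_i^2 - 2 \sum_{i=1}^{n} y_i x_i \]
Solving for $\beta$ we obtain the least squares estimator for $\beta$:
\[ \hat\beta_1 = \frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}. \]
(b) The mean value of $\hat\beta_1$ is given by:
\begin{align*}
E\left[\hat\beta_1\right] &= E\left[\frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}\right]
 \overset{*}{=} \frac{\sum_{i=1}^{n} E[y_i \mid x_i]\, x_i}{\sum_{i=1}^{n} x_i^2}
 \overset{**}{=} \frac{\sum_{i=1}^{n} \beta x_i \cdot x_i}{\sum_{i=1}^{n} x_i^2} = \beta
\end{align*}
* using that $\hat\beta_1$, given the values of $x_i$, only depends on the values of $y_i$, hence the $E[y_i \mid x_i]$ with the condition, and ** using $E[y_i \mid x_i] = \beta x_i$.

For the variance we have:
\begin{align*}
\mathrm{Var}\left(\hat\beta_1\right) &= \mathrm{Var}\left(\frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}\right)
 = \frac{\sum_{i=1}^{n} x_i^2\, \mathrm{Var}(y_i \mid x_i)}{\left(\sum_{i=1}^{n} x_i^2\right)^2}
 = \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}.
\end{align*}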
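2. (a) The expected value of the alternative estimator $\hat\beta_2 = \sum_{i=1}^{n} Y_i / \sum_{i=1}^{n} x_i$ is given by:
\begin{align*}
E\left[\hat\beta_2\right] &= E\left[\frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} x_i}\right]
 = \frac{\sum_{i=1}^{n} E[Y_i \mid x_i]}{\sum_{i=1}^{n} x_i}
 = \frac{\beta \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i} = \beta.
\end{align*}
The variance of the estimator is given by:
\begin{align*}
\mathrm{Var}\left(\hat\beta_2\right) &= \mathrm{Var}\left(\frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} x_i}\right)
 = \frac{\sum_{i=1}^{n} \mathrm{Var}(Y_i \mid x_i)}{\left(\sum_{i=1}^{n} x_i\right)^2}
 = \frac{\sum_{i=1}^{n} \sigma^2}{(n\bar x)^2}
 = \frac{n\sigma^2}{n^2 \bar x^2} = \frac{\sigma^2}{n\bar x^2}.
\end{align*}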
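(b) We need to prove $\mathrm{Var}(\hat\beta_2) \ge \mathrm{Var}(\hat\beta_1)$, which is equivalent to proving that $\mathrm{Var}(\hat\beta_2) - \mathrm{Var}(\hat\beta_1) \ge 0$:
\begin{align*}
\mathrm{Var}(\hat\beta_2) - \mathrm{Var}(\hat\beta_1) &= \frac{\sigma^2}{n\bar x^2} - \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}
 = \sigma^2 \left(\frac{1}{n\bar x^2} - \frac{1}{\sum_{i=1}^{n} x_i^2}\right) \ge 0 \\
 \Leftrightarrow\quad \sum_{i=1}^{n} x_i^2 - n\bar x^2 &\overset{*}{=} \sum_{i=1}^{n} (x_i - \bar x)^2 = (n-1) s_x^2 \ge 0
\end{align*}
where $s_x^2$ is the sample variance of X, * using
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar x)^2 &= \sum_{i=1}^{n} \left(x_i^2 + \bar x^2 - 2 x_i \bar x\right) \\
 &= \sum_{i=1}^{n} x_i^2 + n\bar x^2 - 2\bar x \sum_{i=1}^{n} x_i \\
 &= \sum_{i=1}^{n} x_i^2 + n\bar x^2 - 2n\bar x^2 \\
 &= \sum_{i=1}^{n} x_i^2 - n\bar x^2.
\end{align*}
Thus the variance of the estimator $\hat\beta_2$ is at least as large as the variance of the least squares estimator $\hat\beta_1$ and is strictly larger if there is variability in the values $x_i$ can take.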
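The variance comparison can be illustrated numerically. The sketch below evaluates both variance formulas for a hypothetical design; the function names, the x-values and the choice $\sigma^2 = 1$ are assumptions made for illustration only.

```python
def var_beta1(x, sigma2):
    """Variance of the least squares estimator: sigma^2 / sum(x_i^2)."""
    return sigma2 / sum(v * v for v in x)

def var_beta2(x, sigma2):
    """Variance of the ratio estimator sum(Y) / sum(x): sigma^2 / (n * x_bar^2)."""
    n = len(x)
    x_bar = sum(x) / n
    return sigma2 / (n * x_bar ** 2)

# Hypothetical design points with some spread
x_demo = [1.0, 2.0, 3.0, 4.0]
v1 = var_beta1(x_demo, 1.0)   # 1 / 30
v2 = var_beta2(x_demo, 1.0)   # 1 / 25

# With no variability in x the two variances coincide
x_const = [2.0, 2.0, 2.0]
v1_const = var_beta1(x_const, 1.0)
v2_const = var_beta2(x_const, 1.0)
```

The gap between `v2` and `v1` comes entirely from the spread of the x-values, which matches the role of $(n-1)s_x^2$ in the proof.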
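3. (a) Our estimator is now $\hat\beta_3 = \sum_{i=1}^{n} a_i Y_i$. The mean of the estimator is:
\begin{align*}
E\left[\hat\beta_3\right] &= E\left[\sum_{i=1}^{n} a_i Y_i\right]
 = \sum_{i=1}^{n} a_i E[Y_i \mid x_i]
 = \sum_{i=1}^{n} a_i \beta x_i = \beta \sum_{i=1}^{n} a_i x_i.
\end{align*}
Thus if $\hat\beta_3$ is unbiased we have $E[\hat\beta_3] = \beta$, which is only the case if $\sum_{i=1}^{n} a_i x_i = 1$.
The variance of the estimator is given by:
\begin{align*}
\mathrm{Var}(\hat\beta_3) &= \mathrm{Var}\left(\sum_{i=1}^{n} a_i Y_i\right)
 = \sum_{i=1}^{n} a_i^2\, \mathrm{Var}(Y_i \mid x_i)
 = \sigma^2 \sum_{i=1}^{n} a_i^2.
\end{align*}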
(b) For $\hat\beta_1$ we have:
\[ \hat\beta_1 = \frac{\sum_{i=1}^{n} x_i Y_i}{\sum_{i=1}^{n} x_i^2} = \sum_{i=1}^{n} \frac{x_i}{\sum_{j=1}^{n} x_j^2}\, Y_i, \]
hence $a_i = x_i / \sum_{j=1}^{n} x_j^2$ for $i = 1, \ldots, n$.
We need to verify the condition $\sum_{i=1}^{n} a_i x_i = 1$:
\[ \sum_{i=1}^{n} a_i x_i = \sum_{i=1}^{n} \frac{x_i}{\sum_{j=1}^{n} x_j^2}\, x_i = \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} x_i^2} = 1. \]
For $\hat\beta_2$ we have:
\[ \hat\beta_2 = \frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} x_i} = \sum_{i=1}^{n} \frac{1}{\sum_{j=1}^{n} x_j}\, Y_i, \]
hence $a_i = 1 / \sum_{j=1}^{n} x_j = 1/(n\bar x)$ for $i = 1, \ldots, n$.
We need to verify the condition $\sum_{i=1}^{n} a_i x_i = 1$:
\[ \sum_{i=1}^{n} a_i x_i = \sum_{i=1}^{n} \frac{1}{\sum_{j=1}^{n} x_j}\, x_i = \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i} = 1. \]
(c) We have that $\hat\beta_3$ is the general notation of a linear estimator. The condition $\sum_{i=1}^{n} a_i x_i = 1$ implies that we only look at unbiased estimators. This means that the linear estimator with $a_i = x_i / \sum_{j=1}^{n} x_j^2$, which is the least squares estimator, is the best (i.e., minimum variance) linear unbiased estimator (BLUE estimator).
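1. The linear regression model is given by:
\[ y_i = \alpha + \beta x_i + \varepsilon_i, \]
where $\varepsilon_i \sim N(0, \sigma^2)$ i.i.d. for $i = 1, \ldots, n$.
The fitted linear regression equation is given by:
\[ \hat y = \hat\alpha + \hat\beta x. \]
The estimated coefficients of the linear regression model are given by (see Formulae and Tables page 25):
\begin{align*}
\hat\beta &= \frac{s_{xy}}{s_{xx}} = \frac{1122}{\sum_{i=1}^{n} x_i^2 - n\bar x^2}
 = \frac{1122}{60016 - 836^2/12} = \frac{1122}{1774.67} = 0.63223 \\
\hat\alpha &= \bar y - \hat\beta\,\bar x = \frac{\sum_{i=1}^{n} y_i}{n} - \hat\beta\, \frac{\sum_{i=1}^{n} x_i}{n}
 = \frac{867}{12} - 0.63223 \cdot \frac{836}{12} = 28.205.
\end{align*}
Thus, the fitted linear regression equation is given by:
\[ \hat y = 28.205 + 0.63223\,x. \]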
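2. The estimate for $\sigma^2$ is given by:
\begin{align*}
\hat\sigma^2 &= \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat y_i)^2 \\
 &= \frac{1}{n-2} \left(\sum_{i=1}^{n} y_i^2 - n\bar y^2 - \frac{\left(\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)\right)^2}{\sum_{i=1}^{n} x_i^2 - n\bar x^2}\right) \\
 &= \frac{1}{10} \left(63603 - 867^2/12 - \frac{1122^2}{60016 - 836^2/12}\right) = 25.289
\end{align*}
We know the pivotal quantity:
\[ \frac{s^2}{\sigma^2/(n-2)} \sim \chi^2_{n-2} \]
Note: we have n - 2 degrees of freedom because we have to estimate two parameters from the data ($\hat\alpha$ and $\hat\beta$). We have that $s^2 = \hat\sigma^2$. Thus we have that the 90% confidence interval is given by:
\begin{align*}
\frac{10\hat\sigma^2}{\chi^2_{0.95,10}} &< \sigma^2 < \frac{10\hat\sigma^2}{\chi^2_{0.05,10}} \\
\frac{10 \cdot 25.289}{18.3} &< \sigma^2 < \frac{10 \cdot 25.289}{3.94} \\
13.8 &< \sigma^2 < 64.2
\end{align*}
Thus the 90% confidence interval of $\sigma^2$ is given by (13.8, 64.2).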

3. i) We test the following:
\[ H_0: \beta = 0 \quad \text{v.s.} \quad H_1: \beta > 0, \]
with a level of significance $\alpha = 0.05$.

ii) The test statistic is:
\[ T = \frac{\hat\beta - \beta}{\sqrt{\hat\sigma^2 / \sum_{i=1}^{n} (x_i - \bar x)^2}} \sim t_{n-2} \]
iii) The rejection region of the test is given by:
\[ C = \{(X_1, \ldots, X_n) : T \in (t_{10,1-0.05}, \infty)\} = \{(X_1, \ldots, X_n) : T \in (1.812, \infty)\} \]
iv) The value of the test statistic is given by:
\[ T = \frac{0.63223 - 0}{\sqrt{25.289 / \left(\sum_{i=1}^{n} x_i^2 - n\bar x^2\right)}} = \frac{0.63223 - 0}{\sqrt{25.289 / (60016 - 836^2/12)}} = 5.296. \]
v) The value of the test statistic is in the rejection region, hence we reject the null hypothesis of a zero slope.
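4. We have that the standardised difference between $y_i \mid x_i$ and $\hat y \mid x_i$ has a Student-t distribution:
\[ \frac{y_i \mid x_i - \hat y \mid x_i}{\sqrt{\widehat{\mathrm{Var}}(y_i \mid x_i)}} \sim t_{n-2} \]
The predicted value is given by:
\[ \hat y \mid x_i = \hat\alpha + \hat\beta x_i = 28.205 + 0.63223 \cdot 53 = 61.713. \]
The estimated variance at the observation x = 53 is given by:
\begin{align*}
\widehat{\mathrm{Var}}(y_i \mid x_i = 53) &= \left(\frac{1}{n} + \frac{(x - \bar x)^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}\right) \hat\sigma^2 \\
 &= \left(\frac{1}{12} + \frac{(53 - 836/12)^2}{60016 - 836^2/12}\right) \hat\sigma^2 = 6.0657.
\end{align*}
Thus, the 95% confidence interval for the value of y given that x = 53 is given by:
\begin{align*}
\hat y - t_{1-0.05/2} \sqrt{\widehat{\mathrm{Var}}(y_i \mid x_i = 53)} &< y \mid x = 53 < \hat y + t_{1-0.05/2} \sqrt{\widehat{\mathrm{Var}}(y_i \mid x_i = 53)} \\
61.713 - 2.228 \sqrt{6.0657} &< y \mid x = 53 < 61.713 + 2.228 \sqrt{6.0657} \\
56.2 &< y \mid x = 53 < 67.2
\end{align*}
Thus the 95% confidence interval of y given x = 53 is (56.2, 67.2).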
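5. i) We test the following hypothesis:
\[ H_0: \rho = 0.75 \quad \text{v.s.} \quad H_1: \rho \neq 0.75 \]
ii) The test statistic is given by:
\[ T = \frac{Z_r - z_\rho}{\sqrt{\frac{1}{n-3}}} \sim N(0, 1) \]
iii) The critical region is given by:
\[ C = \{(X_1, \ldots, X_n) : T \in \{(-\infty, -z_{1-\alpha/2}) \cup (z_{1-\alpha/2}, \infty)\}\} \]
iv) The value of the test statistic is given by:
\[ \frac{Z_r - z_\rho}{\sqrt{\frac{1}{9}}} = 3(z_r - z_\rho) = 3(1.2880 - 0.97296) = 0.94512 \]
where
\begin{align*}
z_r &= \frac{1}{2} \log\left(\frac{1+r}{1-r}\right) = \frac{1}{2} \log\left(\frac{1 + 0.85860}{1 - 0.85860}\right) = 1.2880 \\
z_\rho &= \frac{1}{2} \log\left(\frac{1+\rho}{1-\rho}\right) = \frac{1}{2} \log\left(\frac{1 + 0.75}{1 - 0.75}\right) = 0.97296 \\
r &= \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum_{i=1}^{n} (x_i - \bar x)^2 \sum_{i=1}^{n} (y_i - \bar y)^2}}
 = \frac{1122}{\sqrt{\left(\sum_{i=1}^{n} y_i^2 - n\bar y^2\right)\left(\sum_{i=1}^{n} x_i^2 - n\bar x^2\right)}} \\
 &= \frac{1122}{\sqrt{962.25 \cdot 1774.667}} = 0.85860
\end{align*}
v) We have that $z_{0.82894} = 0.95$. Thus, the p-value is given by $2 \cdot (1 - 0.82894) = 0.34212$. The value of the test statistic is not in the critical region if the level of significance is lower than 0.34212 (which is normally the case). Hence, for reasonable values of the level of significance we would not reject the null hypothesis.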
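The Fisher z-transform test above can be reproduced with the standard library alone. The following is a minimal sketch; the helper names `fisher_z` and `std_normal_cdf` are illustrative, and the normal c.d.f. is obtained from `math.erf` instead of a table.

```python
import math

def fisher_z(r):
    """Fisher z-transform: 0.5 * log((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def std_normal_cdf(z):
    """Standard normal c.d.f. expressed through the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Values from the solution: n = 12, sample correlation r, hypothesised rho
n = 12
r = 0.85860
rho_0 = 0.75

z_r = fisher_z(r)
z_rho = fisher_z(rho_0)
t_stat = (z_r - z_rho) * math.sqrt(n - 3)

# Two-sided p-value under the N(0, 1) approximation
p_value = 2 * (1 - std_normal_cdf(abs(t_stat)))
```

The computed p-value differs slightly from the 0.34212 in the solution because the solution rounds the statistic to 0.95 before using the table; the conclusion is unchanged.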
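6. The proportion of the variability explained by the model is given by:
\begin{align*}
R^2 &= \frac{SSM}{SST} = 1 - \frac{SSE}{SST}
 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat y_i)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2} \\
 &= 1 - \frac{\sum_{i=1}^{n} y_i^2 - n\bar y^2 - \dfrac{\left(\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)\right)^2}{\sum_{i=1}^{n} x_i^2 - n\bar x^2}}{\sum_{i=1}^{n} y_i^2 - n\bar y^2} \\
 &= \frac{\left(\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)\right)^2}{\left(\sum_{i=1}^{n} y_i^2 - n\bar y^2\right)\left(\sum_{i=1}^{n} x_i^2 - n\bar x^2\right)} \\
 &= \frac{1122^2}{962.25 \cdot 1774.667} = 0.737193.
\end{align*}
Hence, a large proportion of the variability of Y is explained by X.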
Solution 4.5: [wk10Q5, Exercise, Schedule] The completed ANOVA table is given below:

Source       D.F.   Sum of Squares            Mean Squares   F-Ratio
Regression   1      639.5 - 475.6 = 163.9     163.9          163.9/8.2 = 19.99
Error        58     8.2 * 58 = 475.6          8.2
Total        59     639.5
Solution 4.6: [wk10Q6, Exercise, Schedule] A simple linear regression problem:

1. Since we know that $\hat\beta = r\, \frac{s_y}{s_x}$, then $r = \hat\beta\, \frac{s_x}{s_y} = 7.445\,(2.004/21.56) = 69.2\%$, where $s_x$, $s_y$ are sample standard deviations. Alternatively, you can use the fact that $R^2 = r^2$, so that from (d) below, $r^2 = 0.4794 \Rightarrow r = +\sqrt{0.4794} = 69.2\%$. You take the positive square root because of the positive sign of the coefficient of EPS.

2. Given EPS = 2, we have:
\[ \widehat{STKPRICE} = 25.044 + 7.445\,(2) = 39.934. \]
A 95% confidence interval of this estimate is given by:
\begin{align*}
\hat\alpha + \hat\beta x_0 \pm t_{1-\alpha/2,n-2}\; s \sqrt{\frac{1}{n} + \frac{(\bar x - x_0)^2}{(n-1)\, s_x^2}}
 &= 39.934 \pm \underbrace{t_{1-0.025,46}}_{=2.012896} \sqrt{247} \sqrt{\frac{1}{48} + \frac{(2.338 - 2)^2}{(47)\, 2.004^2}} \\
 &= 39.934 \pm 4.636 = (35.298, 44.570),
\end{align*}
where $s_x^2$ is the sample variance of X.

3. A 95% confidence interval for $\beta$ is:
\begin{align*}
\hat\beta \pm t_{1-\alpha/2,n-2}\, \mathrm{se}\left(\hat\beta\right) &= 7.445 \pm 2.0147\, \frac{\sqrt{247}}{2.004 \sqrt{47}} \\
 &= 7.445 \pm 2.305 \\
 &= (5.14, 9.75).
\end{align*}

4. $s = \sqrt{247} = 15.716$ and $R^2 = \frac{SSM}{SST} = \frac{10475}{21851} = 47.94\%$.
5. A scatter plot or diagram of the fitted values against the residuals (standardised) will provide us an indication of the constancy of the variation in the errors.
6. To test for the significance of the variable EPS, we test $H_0: \beta = 0$ against $H_a: \beta \neq 0$. The test statistic is:
\[ t\left(\hat\beta\right) = \frac{\hat\beta}{\mathrm{se}\left(\hat\beta\right)} = \frac{7.445}{1.144} = 6.508. \]
This is larger than $t_{1-\alpha/2,n-2} = 2.0147$ and therefore we reject the null. There is evidence to support the fact that the EPS variable is a significant predictor of stock price.
7. To test $H_0: \beta = 24$ against $H_a: \beta > 24$, the test statistic is given by:
\[ t\left(\hat\beta\right) = \frac{\hat\beta - \beta_0}{\mathrm{se}\left(\hat\beta\right)} = \frac{7.445 - 24}{1.144} = -14.47. \]
Thus, since this test statistic is smaller than $t_{1-\alpha,n-2} = t_{0.95,46} = 1.676$, do not reject the null hypothesis.
Solution 4.7: [wk10Q7, Exercise, Schedule] The grand total/sum is $\sum x = 2479 + 2619 + 2441 + 2677 = 10216$ so that the grand mean is $\bar x = 10216/40 = 255.4$. Also, $\sum x^2 = 617163 + 687467 + 597607 + 718973 = 2621210$. Therefore the total sum of squares is:
\begin{align*}
SST &= \sum \left(x - \bar x\right)^2 = \sum x^2 - N \bar x^2 \\
 &= 2621210 - (40)(255.4)^2 = 12043.6.
\end{align*}
The sum of squares between the regions is:
\begin{align*}
SSM &= \sum n_i \left(\bar x_{i.} - \bar x\right)^2 \\
 &= 10 \left[(247.9 - 255.4)^2 + (261.9 - 255.4)^2 + (244.1 - 255.4)^2 + (267.7 - 255.4)^2\right] \\
 &= 3774.8.
\end{align*}
The difference gives the sum of squares within the regions:
\[ SSE = SST - SSM = 12043.6 - 3774.8 = 8268.8. \]
The one-way ANOVA table is then summarised below:
ANOVA Table for the One-Way Layout

Source    d.f.   Sum of Squares   Mean Square   F-Statistic
Between   3      3774.8           1258.27       1258.27/229.69 = 5.478
Within    36     8268.8           229.69
Total     39     12043.6

Thus, to test the equality of the mean premiums across the regions, we test:
\[ H_0: \alpha_A = \alpha_B = \alpha_C = \alpha_D = 0 \quad \text{(all variances are equal)} \]
against the alternative:
\[ H_a: \text{at least one } \alpha_i \text{ is not zero} \quad \text{(all variances are equal)} \]
using the F-test. Since $F = 5.478 > F_{0.95}(3, 36) = 2.9$ (approximately), we therefore reject $H_0$. There is evidence to support a difference in the mean premiums across regions. The one-way ANOVA model assumptions are as follows: each random variable $x_{ij}$ is observed according to the model
\[ x_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad \text{for } i = 1, \ldots, I, \text{ and } j = 1, 2, \ldots, n_i \]
where $\varepsilon_{ij}$ refers to the random error in the jth observation of the ith treatment which satisfies:
- $E[\varepsilon_{ij}] = 0$ and $\mathrm{Var}(\varepsilon_{ij}) = \sigma^2$ for all i, j.
- The $\varepsilon_{ij}$ are independent and normally distributed (normal errors),
and where $\mu$ is the overall mean and $\alpha_i$ is the effect of the ith treatment with:
\[ \sum_{i=1}^{I} \alpha_i = 0. \]
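The sums of squares and the F-statistic in Solution 4.7 follow directly from the group totals. The sketch below recomputes them; the variable names are assumptions, and the critical value is not computed (the solution reads it from a table).

```python
# Group totals and total sum of squares quoted in the solution (4 regions, 10 each)
group_sums = [2479.0, 2619.0, 2441.0, 2677.0]
group_size = 10
sum_sq = 617163 + 687467 + 597607 + 718973

n_total = group_size * len(group_sums)
grand_mean = sum(group_sums) / n_total

# Total, between-group and within-group sums of squares
sst = sum_sq - n_total * grand_mean ** 2
ssm = sum(group_size * (s / group_size - grand_mean) ** 2 for s in group_sums)
sse = sst - ssm

# Degrees of freedom and F-statistic
df_between = len(group_sums) - 1
df_within = n_total - len(group_sums)
f_stat = (ssm / df_between) / (sse / df_within)
```

Comparing `f_stat` with the tabulated value of about 2.9 for (3, 36) degrees of freedom gives the rejection reported in the solution.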
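Solution 4.8: [wk10Q8, Exercise, Schedule] Consider the one-way ANOVA model:
\[ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad \text{for } i = 1, \ldots, I \text{ and } j = 1, \ldots, J, \]
where the error terms $\varepsilon_{ij}$ are i.i.d. normal random variables with mean 0 and common variance $\sigma^2$. Since
\[ Y_{ij} \sim N\left(\mu + \alpha_i, \sigma^2\right), \]
the likelihood function is given by:
\[ L\left(y_{ij}; \mu, \alpha_i, \sigma^2\right) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{N} \exp\left(-\frac{1}{2} \sum_{i=1}^{I} \sum_{j=1}^{J} \left(\frac{y_{ij} - \mu - \alpha_i}{\sigma}\right)^2\right) \]
where $N = I \cdot J$ is the grand total sample size. Now, take the log-likelihood and differentiate with respect to each parameter:
\[ \log L = \ell\left(y_{ij}; \mu, \alpha_i, \sigma^2\right) = -\frac{N}{2} \log(2\pi) - N \log \sigma - \frac{1}{2} \sum_{i=1}^{I} \sum_{j=1}^{J} \left(\frac{y_{ij} - \mu - \alpha_i}{\sigma}\right)^2 \]
and
\begin{align*}
\frac{\partial \ell}{\partial \mu} &= \frac{1}{\sigma^2} \sum_{i=1}^{I} \sum_{j=1}^{J} \left(y_{ij} - \mu - \alpha_i\right)
 = \frac{1}{\sigma^2} \left(\sum_{i=1}^{I} \sum_{j=1}^{J} y_{ij} - IJ\mu - J \sum_{i=1}^{I} \alpha_i\right) = 0 \\
\frac{\partial \ell}{\partial \alpha_k} &= \frac{1}{\sigma^2} \sum_{j=1}^{J} \left(y_{kj} - \mu - \alpha_k\right) = 0, \quad \text{for } k = 1, 2, \ldots, I \\
\frac{\partial \ell}{\partial \sigma} &= -\frac{N}{\sigma} + \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{\left(y_{ij} - \mu - \alpha_i\right)^2}{\sigma^3} = 0.
\end{align*}
Assuming $\sum_{i=1}^{I} \alpha_i = 0$, which is a standard assumption in the one-way ANOVA model, we have from the first equation:
\[ \hat\mu = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} y_{ij} = \bar y. \]
From the second equation, we have:
\[ \hat\alpha_k = \frac{1}{J} \sum_{j=1}^{J} y_{kj} - \bar y = \bar y_{k.} - \bar y \]
and from the last equation, we have the MLE for the variance of the error term:
\begin{align*}
\hat\sigma^2 &= \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} \left(y_{ij} - \bar y - \bar y_{i.} + \bar y\right)^2 \\
 &= \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} \left(y_{ij} - \bar y_{i.}\right)^2.
\end{align*}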
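The maximum likelihood estimators of Solution 4.8 are simple averages, which the following sketch makes concrete for a balanced layout. The function name `one_way_mle` and the small data set are hypothetical and only serve as an illustration.

```python
def one_way_mle(groups):
    """MLEs for a balanced one-way ANOVA: grand mean, effects, error variance."""
    num_groups = len(groups)
    per_group = len(groups[0])
    total = num_groups * per_group
    mu_hat = sum(sum(g) for g in groups) / total
    group_means = [sum(g) / per_group for g in groups]
    # alpha_hat_k = group mean minus grand mean
    alpha_hat = [m - mu_hat for m in group_means]
    # MLE of sigma^2 divides by I * J (not by the residual degrees of freedom)
    sigma2_hat = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / total
    return mu_hat, alpha_hat, sigma2_hat

# Hypothetical balanced data: I = 3 groups, J = 2 observations each
groups_demo = [[1.0, 3.0], [4.0, 6.0], [8.0, 8.0]]
mu_hat, alpha_hat, sigma2_hat = one_way_mle(groups_demo)
```

Note the divisor `total` in the variance: the MLE is biased, unlike the usual mean square error, which divides by the within-group degrees of freedom.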
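Solution 4.9: [wk10Q9, Exercise, Schedule] For the one-way ANOVA model we have $Y_{ij} \sim N(\mu + \alpha_i, \sigma^2)$, hence
\[ f(y_{ij}; \mu, \alpha_i, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2} \left(\frac{y_{ij} - (\mu + \alpha_i)}{\sigma}\right)^2\right). \]
The likelihood function can be written as:
\[ L(y_{ij}; \mu, \alpha_i, \sigma) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{\sum_{i=1}^{I} n_i} \exp\left(-\frac{1}{2} \sum_{i=1}^{I} \sum_{j=1}^{n_i} \left(\frac{y_{ij} - (\mu + \alpha_i)}{\sigma}\right)^2\right) \]
and the log-likelihood function is:
\[ l(y_{ij}; \mu, \alpha_i, \sigma) = -\frac{1}{2} \sum_{i=1}^{I} n_i \log(2\pi) - \sum_{i=1}^{I} n_i \log(\sigma) - \frac{1}{2} \sum_{i=1}^{I} \sum_{j=1}^{n_i} \left(\frac{y_{ij} - (\mu + \alpha_i)}{\sigma}\right)^2. \]
Taking the partial derivative of $l$ with respect to $\mu$ and equating to 0:
\begin{align*}
\frac{\partial l}{\partial \mu} = -\frac{1}{2} \sum_{i=1}^{I} \sum_{j=1}^{n_i} 2 \left(\frac{y_{ij} - (\mu + \alpha_i)}{\sigma}\right) \left(-\frac{1}{\sigma}\right) &= 0 \\
\sum_{i=1}^{I} \sum_{j=1}^{n_i} \left(y_{ij} - (\mu + \alpha_i)\right) &= 0 \\
\sum_{i=1}^{I} \sum_{j=1}^{n_i} y_{ij} - \mu \sum_{i=1}^{I} n_i - \underbrace{\sum_{i=1}^{I} n_i \alpha_i}_{0} &= 0 \\
\sum_{i=1}^{I} \sum_{j=1}^{n_i} y_{ij} - N\mu &= 0 \\
\hat\mu &= \frac{\sum_{i=1}^{I} \sum_{j=1}^{n_i} y_{ij}}{N}
\end{align*}
Taking the partial derivative of $l$ with respect to $\alpha_i$ and equating to 0:
\begin{align*}
\frac{\partial l}{\partial \alpha_i} = -\frac{1}{2} \sum_{j=1}^{n_i} 2 \left(\frac{y_{ij} - (\mu + \alpha_i)}{\sigma}\right) \left(-\frac{1}{\sigma}\right) &= 0 \\
\sum_{j=1}^{n_i} \left(y_{ij} - (\mu + \alpha_i)\right) &= 0 \\
\sum_{j=1}^{n_i} y_{ij} - n_i \mu - n_i \alpha_i &= 0 \\
n_i \alpha_i &= \sum_{j=1}^{n_i} y_{ij} - n_i \mu \\
\hat\alpha_i &= \frac{\sum_{j=1}^{n_i} y_{ij}}{n_i} - \hat\mu
\end{align*}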
1. We have the estimated correlation coefficient:
\begin{align*}
r &= \frac{s_{ms}}{\sqrt{s_{mm}\, s_{ss}}}
 = \frac{\sum ms - n\bar m \bar s}{\sqrt{\left(\sum m^2 - n\bar m^2\right)\left(\sum s^2 - n\bar s^2\right)}} \\
 &= \frac{221{,}022.58 - 1136.1 \cdot 1934.2/10}{\sqrt{\left(129{,}853.03 - 1136.1^2/10\right)\left(377{,}700.62 - 1934.2^2/10\right)}} = 0.764.
\end{align*}
i) The hypothesis is:
\[ H_0: \rho = 0 \quad \text{v.s.} \quad H_1: \rho > 0 \]
ii) The test statistic is:
\[ T = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} \sim t_{n-2} \]
iii) The critical region is given by:
\[ C = \{(X_1, \ldots, X_n) : T \in (t_{n-2,1-\alpha}, \infty)\} \]
iv) The value of the test statistic is:
\[ T = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.764\sqrt{10-2}}{\sqrt{1 - 0.764^2}} = 3.35 \]
v) We have $t_{8,1-0.005} = 3.35$. Thus the p-value is 0.005 and we reject the null hypothesis of a zero correlation for any level of significance larger than 0.005 (which is usually the case).
2. Given the issue of whether mortality can be used to predict sickness, we require a plot of sickness against mortality:
[Figure: scatter plot of Sickness (s) against Mortality (m).]
There seems to be an increasing linear relationship such that mortality could be used to predict sickness.
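3. We have the estimates:
\begin{align*}
\hat\beta &= \frac{s_{ms}}{s_{mm}} = \frac{\sum ms - n\bar m \bar s}{\sum m^2 - n\bar m^2}
 = \frac{221{,}022.58 - 1136.1 \cdot 1934.2/10}{129{,}853.03 - 1136.1^2/10} = 1.6371 \\
\hat\alpha &= \bar y - \hat\beta\,\bar x = \frac{1934.2}{10} - 1.6371 \cdot \frac{1136.1}{10} = 7.426 \\
\hat\sigma^2 &= \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat y_i)^2 = \frac{1}{n-2} \left(s_{ss} - \frac{s_{ms}^2}{s_{mm}}\right) \\
 &= \frac{1}{8} \left(\left(\sum s^2 - n\bar s^2\right) - \frac{\left(\sum ms - n\bar m \bar s\right)^2}{\sum m^2 - n\bar m^2}\right) \\
 &= \frac{1}{8} \left(3587.656 - \frac{(1278.118)^2}{780.709}\right) = 186.902 \\
\mathrm{Var}(\hat\beta) &= \hat\sigma^2 / s_{mm} = 186.902/780.709 = 0.2394
\end{align*}
i) Hypothesis:
\[ H_0: \beta = 2 \quad \text{v.s.} \quad H_1: \beta < 2 \]
ii) Test statistic:
\[ T = \frac{\hat\beta - \beta}{\sqrt{\hat\sigma^2 / s_{xx}}} \sim t_{n-2} \]
iii) Critical region:
\[ C = \{(X_1, \ldots, X_n) : T \in (-\infty, -t_{n-2,1-\alpha})\} \]
iv) Value of statistic:
\[ T = \frac{\hat\beta - \beta}{\sqrt{\hat\sigma^2 / s_{xx}}} = \frac{1.6371 - 2}{\sqrt{0.2394}} = -0.74 \]
v) We have from Formulae and Tables page 163: $t_{8,1-0.25} = 0.7064$ and $t_{8,1-0.20} = 0.8889$. Thus the p-value (using symmetry) is between 0.2 and 0.25. Thus, we do not reject the null hypothesis if the level of significance is smaller than the p-value (which is usually the case). Note: exact p-value using computer package is 0.2402.
4. For a region with m = 115 we have the estimated value:
\[ \hat s = 7.426 + 1.6371 \cdot 115 = 195.69 \]
with corresponding variance:
\[ \hat\sigma^2 \left(\frac{1}{n} + \frac{(x_0 - \bar x)^2}{s_{mm}}\right) = 186.902 \left(\frac{1}{10} + \frac{(115 - 113.61)^2}{780.709}\right) = 19.1528 \]
The corresponding 95% confidence limits are $195.69 - t_{8,1-0.025}\, \mathrm{s.e.}(\hat s \mid m = 115) = 195.69 - 2.306 \sqrt{19.1528} = 185.60$ and $195.69 + t_{8,1-0.025}\, \mathrm{s.e.}(\hat s \mid m = 115) = 195.69 + 2.306 \sqrt{19.1528} = 205.78$.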
1. (a) Points are shown in the scatterplot.
[Figure: scatter plot of Number of deaths ($n_i$) against Quarter (i).]
(b) The mean number of deaths increases at an increasing rate with quarter. The variance also appears to increase with quarter.
2. (a) We have q = 12 i=1 (ni i ) . Take the derivative of q with respect to and equate that
2 2
P
equal to zero we obtain:
12
q X
=2 i2 (ni i2 ) = 0
i=1
12
X 12
X
ni i =
2
i4
i=1 i=1
P12 2
ni i
Pi=1
12
=b
.
i=1 i4
2 q
To prove that it is a minimum, we need to prove that 2
> 0:
12
2 q X
= 2 i4 > 0.
2 i=1
P (ni i2 )2 P12
(b) We have q = 12 i=1 i2
= i=1 (ni /i i)2 . Take the derivative of q with respect to
and equate that equal to zero we obtain:
12
q X
=2 i(ni /i i) = 0
i=1
12
X 12
X
ni = i2
i=1 i=1
P12
ni
Pi=1
12 2
=e
.
i=1 i
2 q
To prove that it is a minimum, we need to prove that 2
> 0:
12
2 q X
= 2 i2 > 0.
2 i=1
(c) We have:
P12
ni i2 15694
= Pi=1
b 12 4
= = 0.259
i=1 i
60710
P12
ni 174
= Pi=1
e 12 2
= = 0.268
i=1 i
650
3. (a) We have E[Ni ] = i . If we take the logarithm on both sides we obtain:
log(E[Ni ]) = log() + log(i)
Thus = log() and = .

= 1.6008 and s.e.(b
(b) It is given that b ) = 0.2525. i) Hypothesis:
H0 : = 2 v.s. H1 : , 2
ii) Test statistic:

T=
b
tn2
)
s.e.(b
C = {(X1 . . . , Xn ) : T {(, tn2,1/2 ) (tn2,1/2 , )}}
iv) Value of the test:
1.6008 2
T= = = 1.58
b
)
s.e.(b 0.2525
v) From formulae and Tables page 163 we obtain t10,10.10 = 1.372 and t10,10.05 = 1.812.
Thus the p-value of the hypothesis is between 0.1 and 0.2 (two-sided test!). For level of
significance lower than 0.1 we will accept the null hypothesis that = = 2 and thus this
assumption seems appropriate. Note: exact p-value using computer package is 0.1452.
154
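1. (a) We have:
\begin{align*}
SST &= \sum y^2 - \left(\sum y\right)^2/n = 70.8744 - 29.12^2/16 = 17.8760 \\
\sum x &= 4 \cdot (1 + 2 + 3 + 4) = 40 \qquad \sum x^2 = 4 \cdot (1^2 + 2^2 + 3^2 + 4^2) = 120 \\
\sum xy &= 1 \cdot 2.73 + 2 \cdot 6.26 + 3 \cdot 9.22 + 4 \cdot 10.91 = 86.55 \\
s_{xy} &= \sum xy - \sum x \sum y/n = 86.55 - 40 \cdot 29.12/16 = 13.75 \\
s_{xx} &= \sum x^2 - \left(\sum x\right)^2/n = 120 - 40^2/16 = 20 \\
SSM &= \hat\beta^2 s_{xx} = \left(\frac{13.75}{20}\right)^2 \cdot 20 = 9.453125 \\
SSE &= SST - SSM = 17.8760 - 9.453125 = 8.422875.
\end{align*}
(b)
\begin{align*}
\hat\beta &= \frac{s_{xy}}{s_{xx}} = \frac{13.75}{20} = 0.6875 \\
\hat\alpha &= \bar y - \hat\beta\,\bar x = (29.12 - 0.6875 \cdot 40)/16 = 0.1012
\end{align*}
Thus, the fitted model is given by $\hat y = \hat\alpha + \hat\beta x = 0.1012 + 0.6875x$.
For x = 1 we have: $\hat y = \hat\alpha + \hat\beta x = 0.1012 + 0.6875 \cdot 1 = 0.7887$
For x = 4 we have: $\hat y = \hat\alpha + \hat\beta x = 0.1012 + 0.6875 \cdot 4 = 2.8512$
(c) We have $\mathrm{s.e.}(\hat\beta) = \sqrt{\frac{8.4229/14}{20}} = 0.1734$.
i) Hypothesis:
\[ H_0: \beta = 0 \quad \text{v.s.} \quad H_1: \beta \neq 0 \]
ii) Test statistic:
\[ T = \frac{\hat\beta - \beta}{\mathrm{s.e.}(\hat\beta)} \sim t_{n-2} \]
iii) Critical region:
\[ C = \{(X_1, \ldots, X_n) : T \in \{(-\infty, -t_{n-2,1-\alpha/2}) \cup (t_{n-2,1-\alpha/2}, \infty)\}\} \]
iv) Value of statistic:
\[ T = \frac{\hat\beta - \beta}{\mathrm{s.e.}(\hat\beta)} = \frac{0.6875 - 0}{0.1734} = 3.965 \]
v) We have $t_{14,1-0.001} = 3.787$ and $t_{14,1-0.0005} = 4.140$. Thus the p-value is between 0.1% and 0.2%. We would fail to reject the null hypothesis only if the level of significance were lower than the p-value (which is usually not the case). Hence, we have strong evidence against the "no linear relationship" hypothesis. Note: exact p-value using computer package is 0.00070481.
2. (a) We have:
\begin{align*}
SST &= 17.8760 \\
SSB &= (2.73^2 + 6.26^2 + 9.22^2 + 10.91^2)/4 - 29.12^2/16 = 9.6709 \\
SSR &= SST - SSB = 17.8760 - 9.6709 = 8.2051
\end{align*}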
(b)
\begin{align*}
\hat\mu &= 29.12/16 = 1.82 \\
\hat\alpha_1 &= 2.73/4 - 1.82 = -1.1375 \\
\hat\alpha_2 &= 6.26/4 - 1.82 = -0.255 \\
\hat\alpha_3 &= 9.22/4 - 1.82 = 0.485 \\
\hat\alpha_4 &= 10.91/4 - 1.82 = 0.9075
\end{align*}
(c) Company A: fitted value = 2.73/4 = 0.6825

Company D: fitted value = 10.91/4 = 2.7275
(d) Observed F statistic is (9.6709/3)/(8.2051/12) = 4.715 on (3,12) d.f.
(e) From Formulae and Tables page 173 and 174 we observe that $\Pr(F_{3,12} > 4.474) = 2.5\%$ and $\Pr(F_{3,12} > 5.953) = 1\%$. Thus the p-value is between 0.01 and 0.025, so we have some evidence against the "no company effects" hypothesis. Note: exact p-value using computer package is 0.0213.
Solution 4.13: [wk11Q1, Exercise, Schedule] To find the least squares estimate of $\beta$, we minimize the sum of squares:
\[ SS(\beta) = \sum_{k=1}^{n} \varepsilon_k^2 = \sum_{k=1}^{n} (Y_k - \beta x_k)^2. \]
Taking the first partial derivative and setting it to zero, we have:
\[ \frac{\partial SS(\beta)}{\partial \beta} = -2 \sum_{k=1}^{n} (Y_k - \beta x_k)\, x_k = 0 \]
which gives
\[ \hat\beta = \frac{\sum_{k=1}^{n} x_k Y_k}{\sum_{k=1}^{n} x_k^2}. \]
For the quadratic regression, we simply replace $x_k$ by $x_k^2$ to get the least squares estimate:
\[ \hat\beta = \frac{\sum_{k=1}^{n} x_k^2 Y_k}{\sum_{k=1}^{n} x_k^4}. \]
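Both estimators in Solution 4.13 are the same ratio applied to a different regressor, which a short sketch makes explicit. The function name `lse_through_origin` and the demo data (generated exactly from $y = 2x^2$) are hypothetical.

```python
def lse_through_origin(z, y):
    """Least squares slope for y = beta * z with no intercept."""
    return sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)

# Hypothetical data generated exactly from y = 2 * x**2
x_demo = [1.0, 2.0, 3.0]
y_demo = [2.0, 8.0, 18.0]

# Quadratic regression: use x**2 as the regressor
beta_quadratic = lse_through_origin([v ** 2 for v in x_demo], y_demo)

# Linear regression through the origin on the same data, for comparison
beta_linear = lse_through_origin(x_demo, y_demo)
```

Passing the transformed regressor `x**2` reuses the same formula, which is the "simply replace $x_k$ by $x_k^2$" step in the solution.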
Solution 4.14: [wk11Q2, Exercise, Schedule] To establish a relationship between the coefficient of determination and the correlation coefficient:
1. Starting with the LHS, we have
\begin{align*}
\hat y_k - \bar y &= \hat\alpha + \hat\beta x_k - \bar y \\
 &= \bar y - \hat\beta \bar x + \hat\beta x_k - \bar y \\
 &= \hat\beta (x_k - \bar x).
\end{align*}
2. Using the result in part (a), we have
\[ SSM = \sum_{k=1}^{n} \left(\hat y_k - \bar y\right)^2 = \sum_{k=1}^{n} \left[\hat\beta (x_k - \bar x)\right]^2 = \hat\beta^2 \underbrace{\sum_{k=1}^{n} (x_k - \bar x)^2}_{= S_x^2 (n-1)} \]
3. Using part (b) now, we have
\[ R^2 = \frac{SSM}{SST} = \frac{\hat\beta^2 S_x^2 (n-1)}{S_y^2 (n-1)} = \frac{\hat\beta^2 S_x^2}{S_y^2} = \left(r\, \frac{S_y}{S_x}\right)^2 \frac{S_x^2}{S_y^2} = r^2. \]
Thus, the coefficient of determination, as shown above, is nothing but the square of the correlation coefficient for simple linear regressions.
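Solution 4.15: [wk11Q3, Exercise, Schedule] Consider the linear regression model $Y_k = \alpha + \beta x_k + \varepsilon_k$:
1.
\[ R^2 = \frac{SSM}{SST} = 1 - \frac{SSE}{SST} = 1 - \frac{(n-2)\, MSE}{(n-1)\, S_y^2} = 1 - \frac{(n-2)\, s^2}{(n-1)\, S_y^2}. \]
2. From (a) we solve for $s^2$ to have
\begin{align*}
s^2 &= \frac{(n-1)\, S_y^2}{n-2} \left(1 - R^2\right) \\
 &= \frac{(n-1)\, S_y^2}{n-2} \left(1 - r^2\right)
\end{align*}
and since s is non-negative, take the positive square root of both sides of the previous equation and the result immediately follows.
3. Beginning with the t-ratio statistic:
\begin{align*}
t\left(\hat\beta\right) = \frac{\hat\beta}{\mathrm{se}\left(\hat\beta\right)} &= \frac{r\, S_y / S_x}{s / \left(S_x \sqrt{n-1}\right)} \\
 &= \frac{r\, S_y}{S_y \sqrt{1 - r^2} \sqrt{\frac{n-1}{n-2}} \Big/ \sqrt{n-1}} \\
 &= \frac{r \sqrt{n-2}}{\sqrt{1 - r^2}} = \sqrt{n-2}\, \sqrt{\frac{r^2}{1 - r^2}}
\end{align*}
provided $r \ge 0$ in this case.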
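1.
\[ X = [1_n \;\; x] = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \]
2.
\[ X^\top X = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}
 = \begin{pmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{pmatrix}
 = n \begin{pmatrix} 1 & \bar x \\ \bar x & \frac{1}{n} \sum_{i=1}^{n} x_i^2 \end{pmatrix} \]
3.
\[ X^\top Y = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
 = \begin{pmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{pmatrix} \]
4. Note: the inverse of a 2 x 2 matrix is given by:
\[ M^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{\det(M)} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \]
Using this and the result from (b) we have:
\[ (X^\top X)^{-1} = \frac{1}{n \sum_{i=1}^{n} x_i^2 - n^2 \bar x^2} \begin{pmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar x \\ -n\bar x & n \end{pmatrix}
 = \frac{1}{s_{xx}} \begin{pmatrix} \frac{1}{n} \sum_{i=1}^{n} x_i^2 & -\bar x \\ -\bar x & 1 \end{pmatrix} \]
5. Using the result of (c) and (d) we have:
\begin{align*}
\hat\beta = (X^\top X)^{-1} X^\top Y &= \frac{1}{s_{xx}} \begin{pmatrix} \frac{1}{n} \sum_{i=1}^{n} x_i^2 & -\bar x \\ -\bar x & 1 \end{pmatrix} \begin{pmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{pmatrix} \\
 &= \frac{1}{s_{xx}} \begin{pmatrix} \sum_{i=1}^{n} y_i \cdot \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar x \sum_{i=1}^{n} x_i y_i \\ -\bar x \sum_{i=1}^{n} y_i + \sum_{i=1}^{n} x_i y_i \end{pmatrix}
 = \frac{1}{s_{xx}} \begin{pmatrix} \bar y \sum_{i=1}^{n} x_i^2 - \bar x \sum_{i=1}^{n} x_i y_i \\ \sum_{i=1}^{n} x_i y_i - n\bar x \bar y \end{pmatrix} \\
 &= \frac{1}{s_{xx}} \begin{pmatrix} \bar y \left(\sum_{i=1}^{n} x_i^2 - n\bar x^2\right) - \bar x \left(\sum_{i=1}^{n} x_i y_i - n\bar x \bar y\right) \\ \sum_{i=1}^{n} x_i y_i - n\bar x \bar y \end{pmatrix}
 = \begin{pmatrix} \bar y - \frac{s_{xy}}{s_{xx}}\, \bar x \\ \frac{s_{xy}}{s_{xx}} \end{pmatrix}
\end{align*}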
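The agreement between the matrix form and the familiar closed-form estimates can be checked on a small data set. The sketch below inverts the 2 x 2 matrix by hand using only the standard library; the function names and the demo data are hypothetical.

```python
def ols_via_normal_equations(x, y):
    """Solve (X'X) b = X'Y for simple linear regression with a 2x2 inverse."""
    n = len(x)
    sum_x, sum_xx = sum(x), sum(v * v for v in x)
    sum_y, sum_xy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sum_xx - sum_x * sum_x
    # (X'X)^{-1} = (1/det) * [[sum_xx, -sum_x], [-sum_x, n]]
    alpha_hat = (sum_xx * sum_y - sum_x * sum_xy) / det
    beta_hat = (-sum_x * sum_y + n * sum_xy) / det
    return alpha_hat, beta_hat

def ols_closed_form(x, y):
    """Textbook formulas: beta = s_xy / s_xx, alpha = y_bar - beta * x_bar."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    beta_hat = s_xy / s_xx
    return y_bar - beta_hat * x_bar, beta_hat

# Hypothetical data
x_demo = [1.0, 2.0, 4.0, 7.0]
y_demo = [3.0, 5.0, 4.0, 10.0]
matrix_fit = ols_via_normal_equations(x_demo, y_demo)
closed_fit = ols_closed_form(x_demo, y_demo)
```

The two results should agree up to floating-point error, mirroring the last equality in item 5.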
Solution 4.17: [wk11Q5, Exercise, Schedule] Statement (E) is correct.

Note that statement (A) is incorrect because, if food sales increases with one, the expected profit
1 10 (note the difference in the scale of profit (thousands) and food sales (in ten
increases with b
thousands). Similarly (C) and (D) are incorrect.
Solution 4.18: [wk11Q6, Exercise, Schedule] Statement (D) is correct.

We have n = 25 observations, p = 3 + 1 = 4 parameters (three explanatory variables and the constant), SST = 666.98, and SSM = 610.48. Thus we have:
\begin{align*}
SSE &= SST - SSM = 666.98 - 610.48 = 56.5 \\
R_a^2 &= 1 - \frac{SSE/(n-p)}{SST/(n-1)} = 1 - \frac{56.5/(25-4)}{666.98/(25-1)} = 1 - \frac{56.5/21}{666.98/24} = 0.903
\end{align*}

\begin{align*}
R^2 &\overset{*}{=} \frac{SSM}{SST} \overset{**}{=} \frac{SST - SSE}{SST} \qquad \text{I and II correct} \\
\frac{SSM}{SST} &\overset{**}{=} \frac{SSM}{SSM + SSE} \neq \frac{SSM}{SSE}, \text{ because } SSM > 0, \qquad \text{III incorrect}
\end{align*}
* using the definition of $R^2$ and ** using SST = SSM + SSE.
Solution 4.20: [wk11Q8, Exercise, Schedule] Statement (B) is correct.
SSE/(n p) 8525.3/(47 5) 8525.3/42

R2a =1 =1 =1 = 0.5634
SST/(n 1) 21851.4/(47 1) 21851.4/46

Let C = (X > X)1 and c33 the third diagonal element of the matrix C. We have:
p
s.e. b2 = c33 s2 = 0.0102446 20106 = 55.535928
Solution 4.22: [wk11Q10, Exercise, Schedule] Statement (C) is correct.

We have:
\[ \hat\beta = (X^\top X)^{-1} X^\top Y \]
In order to find the estimate of the parameter related to $x_3$ (having graduated from college) we need the fourth (note 1 corresponds to the constant) row of the matrix $(X^\top X)^{-1}$ and multiply that with the vector $X^\top Y$. We have:
\[ \hat\beta_3 = \begin{pmatrix} -0.026804 & -0.000091 & 0.023971 & 0.083184 \end{pmatrix} \begin{pmatrix} 9{,}558 \\ 4{,}880{,}937 \\ 7{,}396 \\ 6{,}552 \end{pmatrix} = 21.953 \]
Note that y is in hundreds of dollars, so having graduated from college adds $21.953 \cdot 100 = 2{,}195.3$ to the amount paid for a car.
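The row-times-vector product in Solution 4.22 can be verified directly. In this sketch the signs of the first two row entries are an assumption: they are not visible in the text and were chosen because they reproduce the stated result of 21.953.

```python
# Fourth row of (X'X)^{-1}; the two negative signs are assumed (see lead-in)
row_of_inverse = [-0.026804, -0.000091, 0.023971, 0.083184]

# The vector X'Y as quoted in the solution
xty = [9558.0, 4880937.0, 7396.0, 6552.0]

# Dot product gives the coefficient estimate (y is in hundreds of dollars)
beta3_hat = sum(a * b for a, b in zip(row_of_inverse, xty))
effect_in_dollars = beta3_hat * 100
```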
Solution 4.23: [wk11Q11, Exercise, Schedule] Statement (A) is correct.

We have that the distribution of $\hat\beta_k$ for $k = 1, \ldots, p$ is given by:
\[ \frac{\hat\beta_k - \beta_k}{\mathrm{s.e.}\left(\hat\beta_k\right)} \sim t_{n-p} \]
We have p = 5, and n = 212. Note, n - p is large, thus the standard normal approximation for the Student-t is appropriate (Formulae and Tables page 163 only shows a table for degrees of freedom up to 120 and $\infty$ = standard normal). We have $z_{1-0.05/2} = 1.96$. This provides the well-known rule of thumb that the absolute value of the T value should be larger than 2 for parameter estimates to be significant (|T| > 2). This is the case for all parameters.
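Solution 4.24: [wk11Q12, Exercise, Schedule] Statement (E) is correct.

The unbiased estimator $s^2$ of $\sigma^2$ is given by:
\begin{align*}
s^2 &= \frac{1}{n-p} \left(Y - \hat Y\right)^\top \left(Y - \hat Y\right) \\
 &= \frac{1}{n-p} \left(Y - X\hat\beta\right)^\top \left(Y - X\hat\beta\right)
\end{align*}

\begin{align*}
\text{LIFE EXP} &= 48.24 + 0.79 \cdot \text{GNP} + 0.154 \cdot \text{URBAN\%} \\
 &= 48.24 + 0.79 \cdot 3 + 0.154 \cdot 60 \\
 &= 59.85
\end{align*}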

(A) Can be done by the scatterplot, but a QQ-plot is better.
(B) Can be done by the scatterplot, but R2 is better method.
(C) Is the correct one, need both the errors and the corresponding value of the endogenous variable.
(D) Should be by definition by selecting the LS estimator, so does not need to be tested.
(E) Errors should be independent of X not Y.
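1. We have:
\begin{align*}
E\left[\hat\beta \mid X = x\right] &= E\left[(X^\top X)^{-1} X^\top Y \mid X = x\right] \\
 &= (X^\top X)^{-1} X^\top E\left[Y \mid X = x\right] \\
 &= (X^\top X)^{-1} X^\top X \beta \\
 &= \beta
\end{align*}
2. We have:
\begin{align*}
\mathrm{Var}\left(\hat\beta \mid X = x\right) &= \mathrm{Var}\left((X^\top X)^{-1} X^\top Y \mid X = x\right) \\
 &\overset{*}{=} (X^\top X)^{-1} X^\top\, \mathrm{Var}\left(Y \mid X = x\right) X (X^\top X)^{-1} \\
 &= (X^\top X)^{-1} X^\top \sigma^2 I_n X (X^\top X)^{-1} \\
 &\overset{**}{=} \sigma^2 (X^\top X)^{-1} X^\top X (X^\top X)^{-1} \\
 &= \sigma^2 (X^\top X)^{-1}
\end{align*}
* Let $A = (X^\top X)^{-1} X^\top$ be a matrix and Y a vector of random variables; we have $\mathrm{Var}(AY) = A\, \mathrm{Var}(Y) A^\top$, and $A^\top = \left((X^\top X)^{-1} X^\top\right)^\top = (X^\top)^\top \left((X^\top X)^{-1}\right)^\top = X (X^\top X)^{-1}$, using symmetry of the matrix $(X^\top X)^{-1}$ and the fact that for matrices A and B we have $(AB)^\top = B^\top A^\top$. ** using that $\sigma^2$ is a constant and $X^\top I_n X = X^\top X$.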
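1. The matrix $H = X(X^\top X)^{-1} X^\top$ is an n x n matrix which is often called the hat matrix, because when we pre-multiply the vector Y with this matrix (i.e., HY) one obtains the estimated values $\hat Y$.
2. We have:
\begin{align*}
H^\top H = H^2 &= \left(X(X^\top X)^{-1} X^\top\right)^\top X(X^\top X)^{-1} X^\top \\
 &\overset{*}{=} (X^\top)^\top \left((X^\top X)^{-1}\right)^\top X^\top X (X^\top X)^{-1} X^\top \\
 &= X (X^\top X)^{-1} X^\top X (X^\top X)^{-1} X^\top \\
 &= X (X^\top X)^{-1} X^\top \\
 &= H
\end{align*}
* see 15b.
3. We have:
\[ \hat Y = X\hat\beta = HY = \begin{pmatrix} h_{11} & h_{12} & \cdots & h_{1n} \\ h_{21} & h_{22} & \cdots & h_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ h_{n1} & h_{n2} & \cdots & h_{nn} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \]
The ith value of the vector $\hat Y$ is $\hat y_i$ and is equal to the ith row of the matrix H multiplied with the vector Y. Thus we have:
\[ \hat y_i = \sum_{j=1}^{n} h_{ij}\, y_j = h_{ii}\, y_i + \sum_{j \neq i} h_{ij}\, y_j \]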
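4. In question 4(d) we proved:
\[ (X^\top X)^{-1} = \frac{1}{n \sum_{i=1}^{n} x_i^2 - n^2 \bar x^2} \begin{pmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar x \\ -n\bar x & n \end{pmatrix}
 = \frac{1}{s_{xx}} \begin{pmatrix} \frac{1}{n} \sum_{i=1}^{n} x_i^2 & -\bar x \\ -\bar x & 1 \end{pmatrix} \]
Using this and $H = X(X^\top X)^{-1} X^\top$ we have:
\begin{align*}
H &= X (X^\top X)^{-1} X^\top
 = \frac{1}{s_{xx}} \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \begin{pmatrix} \frac{1}{n} \sum_{i=1}^{n} x_i^2 & -\bar x \\ -\bar x & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix} \\
 &= \frac{1}{s_{xx}} \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \begin{pmatrix} \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar x x_1 & \cdots & \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar x x_n \\ -\bar x + x_1 & \cdots & -\bar x + x_n \end{pmatrix} \\
 &= \frac{1}{s_{xx}} \begin{pmatrix} \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar x x_1 + (x_1 - \bar x) x_1 & \cdots & \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar x x_n + (x_n - \bar x) x_1 \\ \vdots & \ddots & \vdots \\ \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar x x_1 + (x_1 - \bar x) x_n & \cdots & \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar x x_n + (x_n - \bar x) x_n \end{pmatrix}
\end{align*}
The $(i,j)$th element of H is given by:
\begin{align*}
h_{i,j} &= \frac{1}{s_{xx}} \left(\frac{1}{n} \sum_{k=1}^{n} x_k^2 - \bar x x_j + (x_j - \bar x) x_i\right) \\
 &= \frac{1}{s_{xx}} \left(\frac{1}{n} \left(\sum_{k=1}^{n} x_k^2 - n\bar x^2\right) + \bar x^2 - \bar x x_j + (x_j - \bar x) x_i\right) \\
 &= \frac{1}{n} + \frac{1}{s_{xx}} \left(\bar x^2 - \bar x x_j + (x_j - \bar x) x_i\right) \\
 &= \frac{1}{n} + \frac{\bar x^2 - \bar x x_j + x_i x_j - \bar x x_i}{s_{xx}} \\
 &= \frac{1}{n} + \frac{(x_i - \bar x)(x_j - \bar x)}{s_{xx}}
\end{align*}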
5. We have:
e =Y b
b Y = Y HY = (In H)Y
where In is a n n identity matrix.
6. Using the result in part (e) we have:

h i
e|X = x =E (In H)Y|X = x

E b
=(In H)E Y|X = x

=(In H)X
=X HX
=X X(X > X)1 X > X
=X X
=0
where 0 is a vector of size n with all elements equal to zero.
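7. Using the result in part (e) we have:
\[
\begin{aligned}
\mathrm{Var}\left(\hat{e} \mid X = x\right) &= \mathrm{Var}\left((I_n - H)Y \mid X = x\right) \\
&= (I_n - H)\,\mathrm{Var}\left(Y \mid X = x\right)(I_n - H)^\top \\
&= (I_n - H)\,\sigma^2 I_n\,(I_n - H)^\top \\
&= \sigma^2 (I_n - H)(I_n^\top - H^\top) \\
&= \sigma^2 (I_n I_n^\top - I_n H^\top - H I_n^\top + H H^\top) \\
&= \sigma^2 (I_n I_n - I_n H^\top - H I_n + H H^\top) \\
&\overset{*}{=} \sigma^2 (I_n - H - H + H) \\
&= \sigma^2 (I_n - H)
\end{aligned}
\]
* using $HH^\top = (H^\top H)^\top = (H)^\top = H$, where the second equality is derived in question (b) and the third equality is due to the symmetry of the matrix $H$, which can be observed from the results in question (d).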
8. From our results above, the $i$th least squares residual has variance given by:
\[
\mathrm{Var}\left(\hat{e}_i \mid X = x\right) = \sigma^2 (1 - h_{ii}),
\]
where $h_{ii}$ is the $i$th diagonal element of $H$. Standardizing $\hat{e}_i$ means that we have to subtract its mean (which equals zero) and divide by the standard deviation (sample standard deviation in this case). The estimate of the variance is given by: $\hat{\sigma}^2 = s^2 = \frac{1}{n-p} \sum_{i=1}^{n} \hat{e}_i^2$. Therefore, the standardized residual is given by: $\hat{e}_i / \left(s\sqrt{1 - h_{ii}}\right)$.
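The residual results in the last four parts can be illustrated numerically. The sketch below makes several assumptions that are not in the exercise: the data in `x` and `y` are hypothetical, the model is a simple linear regression with an intercept (so `p = 2`), and the hat matrix is built from the closed form for $h_{i,j}$ derived earlier.

```python
import math

# Hypothetical data for illustration only (not from the exercise).
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 7.0, 9.0]
n, p = len(x), 2  # p = 2 assumes an intercept and one slope parameter

x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Hat matrix from h_ij = 1/n + (x_i - x_bar)(x_j - x_bar) / s_xx.
H = [[1.0 / n + (xi - x_bar) * (xj - x_bar) / s_xx for xj in x] for xi in x]

# M = I_n - H.
M = [[(1.0 if i == j else 0.0) - H[i][j] for j in range(n)] for i in range(n)]

# Residuals: e_hat = (I_n - H) Y.
resid = [sum(M[i][j] * y[j] for j in range(n)) for i in range(n)]

# (I_n - H) X = 0 for both columns of X (intercept column, then x),
# which is why E[e_hat | X] = (I_n - H) X beta = 0.
columns = ([1.0] * n, x)
MX = [[sum(M[i][j] * col[j] for j in range(n)) for col in columns] for i in range(n)]
mx_gap = max(abs(v) for row in MX for v in row)

# (I_n - H)(I_n - H)^T = I_n - H, so Var(e_hat | X) = sigma^2 (I_n - H).
MMt = [[sum(M[i][k] * M[j][k] for k in range(n)) for j in range(n)] for i in range(n)]
var_gap = max(abs(MMt[i][j] - M[i][j]) for i in range(n) for j in range(n))

# Standardized residuals: e_i / (s * sqrt(1 - h_ii)), with s^2 = sum(e_i^2) / (n - p).
s = math.sqrt(sum(e ** 2 for e in resid) / (n - p))
standardized = [e / (s * math.sqrt(1.0 - H[i][i])) for i, e in enumerate(resid)]

print(mx_gap, var_gap)
print(standardized)
```

Both printed gaps should be zero up to rounding. Observations with a larger $h_{ii}$ have residuals with smaller variance, so dividing by $\sqrt{1 - h_{ii}}$ puts all residuals on a comparable scale.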