
ESGC6121

Survey Techniques and Sampling Design - Test 2


14 December 2012

1a. Would you select a two-stage or a single-stage cluster sample if each of the 400 enumeration blocks has around 500 houses? Why?
(2 marks)
Two-stage, because elements within an enumeration block (EB) tend to be homogeneous. Given a fixed sample size and budget, one should cover as many EBs as possible to reduce sampling error.
1b. What are the main differences between stratified random sampling and cluster sampling?
(2 marks)
Stratified random sampling involves a few groups (strata), and elements must be selected from every stratum. The variance (sampling error) tends to be smaller than under SRS because elements within each stratum are homogeneous. Cluster sampling involves selecting some clusters from a large number of clusters, each of which tends to be homogeneous within. In selecting clusters that are homogeneous, one may miss clusters that have different characteristics, and hence the sampling error is generally larger than under SRS. Single-stage cluster sampling covers all elements in the selected clusters, whereas stratification requires an SRS or systematic sample to be drawn from each stratum.

1c. Which of the following sampling schemes will produce a more precise estimate? Why?
i) n = 20 and m = 50, or ii) n = 50 and m = 20 (where n is the number of clusters and m is the number of elements per cluster)
(2 marks)
Scheme ii) is better because the sample is spread over more clusters, which reduces the effect of within-cluster homogeneity. The chance of covering clusters with different characteristics is better than under design i).
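The comparison in 1c can be illustrated with Kish's approximate design effect, deff = 1 + (m − 1) × roh. This is only a sketch: the rate of homogeneity roh = 0.05 is a hypothetical value chosen for illustration and is not given in the question.

```python
def design_effect(m, roh):
    """Kish's approximate design effect for clusters of equal size m."""
    return 1 + (m - 1) * roh


# Hypothetical rate of homogeneity; the question does not supply one.
roh = 0.05

# (n clusters, m elements per cluster) for the two schemes in 1c.
schemes = {"i": (20, 50), "ii": (50, 20)}

for label, (n, m) in schemes.items():
    deff = design_effect(m, roh)
    effective_n = n * m / deff  # equivalent SRS sample size
    print(f"scheme {label}: deff = {deff:.2f}, effective n = {effective_n:.1f}")
```

Both schemes contain 1000 elements, but for any positive roh the scheme with smaller clusters has the smaller design effect and hence the larger effective sample size.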

2. The director of University Hospital wants to find out the average waiting time at Clinic G. He intends to select a systematic sample of 60 patients out of the total of 600 registered on a particular day. The result shows a mean waiting time ȳ = 110 minutes and a standard deviation s = 20 minutes. Find the finite population correction factor and the 95% confidence interval for the mean waiting time.
(3 marks)

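$$V(\bar{y}) = \frac{s^2}{n}\left(\frac{N-n}{N}\right)$$

The worked figures use N − 1 in the denominator of the correction factor: FPC = (N − n)/(N − 1) = 540/599 = 0.9015.

N             600
n             60
ȳ             110
s             20
s²            400
FPC           0.901503
Var(ȳ)        6.010017
SE            2.451534
ME (2 × SE)   4.903067
95% CI        105.0969 to 114.9031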
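The figures for question 2 can be checked with a short script. This is a minimal sketch that follows the worked answer in using (N − n)/(N − 1) as the correction factor and a multiplier of 2 for the 95% bound; the variable names are illustrative only.

```python
import math

N, n = 600, 60          # population and sample size
y_bar, s = 110.0, 20.0  # sample mean and standard deviation (minutes)

fpc = (N - n) / (N - 1)     # correction factor as used in the worked answer
var_y_bar = s**2 / n * fpc  # estimated variance of the sample mean
se = math.sqrt(var_y_bar)
me = 2 * se                 # bound on the error of estimation (about 95%)
ci = (y_bar - me, y_bar + me)

print(f"FPC = {fpc:.6f}, Var = {var_y_bar:.6f}, SE = {se:.6f}, ME = {me:.6f}")
print(f"95% CI: {ci[0]:.4f} to {ci[1]:.4f}")
```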

3. The bank manager wanted to estimate the total amount of cash withdrawn from ATM machines located in shopping complexes and other places. He selected a random sample of ATM machines from the 100 machines located in shopping complexes and the 200 machines located in other places. The results of his study are shown in Table 1.

Table 1
                   N     n    ȳ_i   s_i
Shopping complex   100   20   320   60
Other places       200   40   168   20

a) Did he use equal, proportional or Neyman allocation in selecting the sample?
(1 mark)
Proportional: the sampling fraction is the same in both strata (20/100 = 40/200 = 0.2).

b) Find ȳ_st and place a bound on the error of estimation (95%).
(3 marks)

$$V(\bar{y}_{st}) = \frac{1}{N^2}\sum_{i=1}^{L} N_i (N_i - n_i)\frac{s_i^2}{n_i}$$

                           Shopping complex   Other places   Total
N_i                        100                200            300
n_i                        20                 40
ȳ_i                        320                168
s_i                        60                 20
s_i²                       3600               400
w_i = N_i/N                0.3333             0.6667
N_i − n_i                  80                 160
w_i ȳ_i                    106.656            112.0056       218.6616
N_i (N_i − n_i) s_i²/n_i   1440000            320000         1760000

ȳ_st = 218.6616
Var(ȳ_st) = 1760000/300² = 19.55556
SE = 4.422166
ME (2 × SE) = 8.844333
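The stratified estimate in 3b can be reproduced as follows. This is a sketch under the same formula; it uses exact weights N_i/N, so the mean comes out as 218.6667 rather than the 218.6616 obtained above with weights rounded to four decimals. The dictionary layout is an illustrative choice.

```python
import math

# Stratum data from Table 1: (N_i, n_i, sample mean, sample sd)
strata = {
    "Shopping complex": (100, 20, 320.0, 60.0),
    "Other places": (200, 40, 168.0, 20.0),
}

N = sum(N_i for N_i, _, _, _ in strata.values())

# Stratified mean: weighted average of stratum means with weights N_i / N.
y_st = sum(N_i * y_i for N_i, _, y_i, _ in strata.values()) / N

# Variance of the stratified mean, with a finite population term per stratum.
var_st = sum(
    N_i * (N_i - n_i) * s_i**2 / n_i for N_i, n_i, _, s_i in strata.values()
) / N**2

se = math.sqrt(var_st)
bound = 2 * se  # bound on the error of estimation (about 95%)

print(f"y_st = {y_st:.4f}, Var = {var_st:.5f}, SE = {se:.6f}, bound = {bound:.6f}")
```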

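c) Determine the sample size using Neyman allocation, if the margin of error B = 10.
(3 marks)

$$n = \frac{\sum_{i=1}^{L} N_i^2 \sigma_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i \sigma_i^2}, \qquad D = \frac{B^2}{4}$$

                            Shopping complex   Other places   Total
N_i                         100                200            300
s_i²                        3600               400
N_i s_i                     6000               4000           10000
a_i = N_i s_i / Σ N_i s_i   0.6                0.4
N_i² s_i²/a_i (Num)         60000000           40000000       100000000
N_i s_i²                    360000             80000          440000

D = B²/4 = 25
N² D = 2250000
Denominator = 2250000 + 440000 = 2690000
n = 100000000/2690000 = 37.174721, rounded up to 38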
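The sample size in 3c can be verified numerically. This is a minimal sketch of the formula above, with the stratum standard deviations from Table 1 standing in for σ_i and the Neyman fractions a_i computed as N_i s_i / Σ N_i s_i; names such as `neyman_fraction` are illustrative.

```python
import math

B = 10                # required bound on the error of estimation
N_i = [100, 200]      # stratum sizes
s_i = [60.0, 20.0]    # stratum standard deviations (used in place of sigma_i)

N = sum(N_i)
D = B**2 / 4

# Neyman allocation fractions: proportional to N_i * s_i.
Ns = [Ni * si for Ni, si in zip(N_i, s_i)]
neyman_fraction = [x / sum(Ns) for x in Ns]

numerator = sum(Ni**2 * si**2 / ai for Ni, si, ai in zip(N_i, s_i, neyman_fraction))
denominator = N**2 * D + sum(Ni * si**2 for Ni, si in zip(N_i, s_i))

n_exact = numerator / denominator
n_required = math.ceil(n_exact)  # always round up to meet the bound

print(f"n = {n_exact:.6f}, rounded up to {n_required}")
```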

4. Use the Melaka sample data to answer this question. The distribution of households in Melaka is as follows: Alor Gajah (21.1%), Jasin (16.4%), Melaka Tengah (62.5%). Create a weight by district. Using weighted and unweighted data, compare i) the ethnic distribution; ii) the proportion of households in Melaka owning an air conditioner. (Hint: run the frequency/descriptive command to get the distribution/proportion (mean).)
(4 marks)
Unweighted
Ethnic group
                   Frequency   Percent   Valid Percent   Cumulative Percent
Malays             201         67.0      67.0            67.0
Other Bumiputera   1           0.3       0.3             67.3
Chinese            72          24.0      24.0            91.3
Indians            11          3.7       3.7             95.0
Others             1           0.3       0.3             95.3
Non-citizens       14          4.7       4.7             100.0
Total              300         100.0     100.0
Weighted
Ethnic group
                   Frequency   Percent   Valid Percent   Cumulative Percent
Malays             176         58.5      58.5            58.5
Other Bumiputera   [missing]   0.6       0.6             59.1
Chinese            99          32.8      32.8            92.0
Indians            [missing]   2.0       2.0             94.0
Others             [missing]   0.2       0.2             94.2
Non-citizens       17          5.8       5.8             100.0
Total              300         100.0     100.0

Proportion of households owning an air conditioner: unweighted = 0.09; weighted = 0.14
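The district weight in question 4 is the population share divided by the sample share. The sketch below shows the idea only: the population shares come from the question, but the sample counts and air-conditioner counts per district are hypothetical placeholders, since the Melaka data file is not reproduced here, and the results will not match the figures above.

```python
# Population shares of households by district (from the question).
population_share = {"Alor Gajah": 0.211, "Jasin": 0.164, "Melaka Tengah": 0.625}

# HYPOTHETICAL sample counts by district (the real file's counts are not shown).
sample_count = {"Alor Gajah": 100, "Jasin": 100, "Melaka Tengah": 100}
# HYPOTHETICAL numbers of sampled households owning an air conditioner.
aircon_count = {"Alor Gajah": 6, "Jasin": 5, "Melaka Tengah": 16}

n_total = sum(sample_count.values())

# Weight = population share / sample share, so weighted counts sum to n_total.
weight = {
    d: population_share[d] / (sample_count[d] / n_total) for d in population_share
}

unweighted_p = sum(aircon_count.values()) / n_total
weighted_p = sum(weight[d] * aircon_count[d] for d in weight) / sum(
    weight[d] * sample_count[d] for d in weight
)

for d, w in weight.items():
    print(f"{d}: weight = {w:.3f}")
print(f"unweighted p = {unweighted_p:.3f}, weighted p = {weighted_p:.3f}")
```

Districts that are under-represented in the sample receive weights above 1, which is why weighted and unweighted distributions can differ.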

5. A durian seller chose 3 baskets out of 10 baskets to find out the proportion of rotten durians. The results are shown in Table 2. Estimate the proportion of rotten durians and place a bound on the error of estimation (at the 95% confidence level).
(Note: the total number of durians M is unknown.)
(4 marks)
Table 2
Basket   M (number of durians)   A (number of rotten durians)
A        50                      5
B        40                      8
C        35                      5

$$V(\hat{p}) = \left(\frac{N-n}{N n \bar{M}^2}\right) s_p^2, \qquad s_p^2 = \frac{\sum_{i=1}^{n} (a_i - \hat{p}\, m_i)^2}{n-1}$$

Basket   m_i   a_i   a_i − p̂ m_i   (a_i − p̂ m_i)²
A        50    5     −2.2          4.84
B        40    8     2.24          5.0176
C        35    5     −0.04         0.0016
Total    125   18                  9.8592

p̂ = 18/125 = 0.144
s_p² = 9.8592/2 = 4.9296
M̄ = 125/3 = 41.666667
FPC term (N − n)/(N n M̄²) = 0.0001344
Var(p̂) = 0.000663
SE = 0.0257398
ME (2 × SE) = 0.0514796
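The durian calculation can be checked with a few lines of code. This is a sketch of the ratio estimator above; because the total number of durians M is unknown, the average basket size is estimated from the three sampled baskets, as in the worked answer.

```python
import math

N, n = 10, 3       # baskets in total and baskets sampled
m = [50, 40, 35]   # durians per sampled basket
a = [5, 8, 5]      # rotten durians per sampled basket

p_hat = sum(a) / sum(m)  # ratio estimate of the proportion rotten
m_bar = sum(m) / n       # estimated average basket size (M is unknown)

# Variability of the basket totals around the ratio line a_i = p_hat * m_i.
s2_p = sum((ai - p_hat * mi) ** 2 for ai, mi in zip(a, m)) / (n - 1)

var_p = (N - n) / (N * n * m_bar**2) * s2_p
se = math.sqrt(var_p)
bound = 2 * se  # bound on the error of estimation (about 95%)

print(f"p = {p_hat:.3f}, s_p^2 = {s2_p:.4f}, Var = {var_p:.6f}")
print(f"SE = {se:.7f}, bound = {bound:.7f}")
```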

6. A public health researcher decided to use two-stage cluster sampling to select 400 students from a total of 24,000 students to study the proportion of students who are obese in the district. In the first stage, he selected 4 schools out of 12 schools, and in the second stage he selected 100 students from each of the 4 selected schools.
Table 3
School   Number of students
A        2650
B        1290
C        1318
D        2010
E        2220
F        2775
G        1505
H        2893
I        1305
J        3390
K        1622
L        1022
Total    24000

a) Please refer to Table 3 and help him select the sample using systematic sampling with probability proportional to size. (Let 1039 be the random start.)
(2 marks)
b) What are the first-stage, second-stage and overall sampling fractions for each of the 4 selected schools?
(2 marks)
c) If the proportions of obese students in the selected schools are 0.24, 0.16, 0.35 and 0.19, estimate the proportion of obese students in the district and place a bound on the error of estimation.
(2 marks)

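Sampling interval = 24000/4 = 6000; random start RS = 1039; b = 100 students per selected school; F = 6000/100 = 60, so the overall sampling fraction is 1/60 = 0.016667.
Selection numbers: 1039, 7039, 13039, 19039.

School   Number of students   Cumulative   Selection number   1st stage   2nd stage   Overall    p
A        2650                 2650         1039               0.441667    0.037736    0.016667   0.24
B        1290                 3940
C        1318                 5258
D        2010                 7268         7039               0.335       0.049751    0.016667   0.16
E        2220                 9488
F        2775                 12263
G        1505                 13768        13039              0.250833    0.066445    0.016667   0.35
H        2893                 16661
I        1305                 17966
J        3390                 21356        19039              0.565       0.029499    0.016667   0.19
K        1622                 22978
L        1022                 24000
Total    24000

The selected schools are A, D, G and J.

School   p      p − p̂_pps   (p − p̂_pps)²
A        0.24   0.005        0.000025
D        0.16   −0.075       0.005625
G        0.35   0.115        0.013225
J        0.19   −0.045       0.002025
Total                        0.0209

p̂_pps = 0.235
Var(p̂_pps) = 0.0209/(4 × 3) = 0.001742
SE = 0.041733
ME (2 × SE) = 0.083467

END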
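The selection and estimate in question 6 can be reproduced with the sketch below. It assumes the usual cumulative-size rule (a school is selected when a selection number falls within its cumulative range) and the simple between-school variance Σ(p_i − p̂)²/(n(n − 1)); variable names are illustrative.

```python
import math
from itertools import accumulate

# School sizes from Table 3, in listing order.
schools = {
    "A": 2650, "B": 1290, "C": 1318, "D": 2010, "E": 2220, "F": 2775,
    "G": 1505, "H": 2893, "I": 1305, "J": 3390, "K": 1622, "L": 1022,
}
n_schools, random_start, b = 4, 1039, 100

total = sum(schools.values())
interval = total // n_schools
points = [random_start + k * interval for k in range(n_schools)]

# A school is selected when a selection number falls in its cumulative range.
names = list(schools)
cumulative = list(accumulate(schools.values()))
selected = [
    names[next(i for i, c in enumerate(cumulative) if pt <= c)] for pt in points
]

# Sampling fractions: first stage (PPS), second stage, and their product.
fractions = {
    s: (n_schools * schools[s] / total, b / schools[s]) for s in selected
}
overall = {s: f1 * f2 for s, (f1, f2) in fractions.items()}

# Self-weighting design: the estimate is the plain mean of school proportions.
p = [0.24, 0.16, 0.35, 0.19]
p_pps = sum(p) / len(p)
var_p = sum((pi - p_pps) ** 2 for pi in p) / (len(p) * (len(p) - 1))
se = math.sqrt(var_p)
bound = 2 * se

print("selected:", selected)
print(f"p = {p_pps:.3f}, Var = {var_p:.6f}, SE = {se:.6f}, bound = {bound:.6f}")
```

Because every selected school has the same overall sampling fraction, no further weighting is needed when averaging the four proportions.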
