SIMPLE RANDOM SAMPLING

4.1 Introduction
Definition: If a sample of size n is drawn from a population of size N such that
every possible sample of size n has the same chance of being selected, the
sampling procedure is called simple random sampling. The sample obtained is
called a simple random sample.

4.2 How to Draw a Simple Random Sample


Example 4.1: For simplicity, assume there are N = 1000 patient records from
which a simple random sample of n = 20 is to be drawn. We know that a simple
random sample will be obtained if every possible sample of n = 20 records has the
same chance of being selected. Using a given table of random numbers, determine
which records are to be included in the sample.
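The selection can also be done in software rather than with a printed random number table. The following is a minimal Python sketch, assuming the records are labelled 1 to 1000; the labelling and the seed value are illustrative choices, not part of the example.

```python
import random

N = 1000  # population size: patient records labelled 1..N (labelling is an assumption)
n = 20    # sample size

rng = random.Random(2024)  # arbitrary seed, used only so the draw can be reproduced

# sample() draws without replacement, so every subset of n records is equally likely
selected = sorted(rng.sample(range(1, N + 1), n))
print(selected)
```

Each run with a different seed gives a different, equally likely set of 20 record numbers.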
4.3 Estimation of a Population Mean and Total
Example 4.2: Refer to the hospital audit in the previous example and suppose that a
random sample of n = 200 accounts is selected from the total of N = 1000. The
sample mean of the accounts is found to be ȳ = RM94.22, and the sample
variance is s² = 445.21. Estimate 𝜇, the average amount due for all 1000 hospital
accounts, and place a bound on the error of estimation.
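A possible working for this example is sketched below in Python. It assumes the usual simple random sampling estimator of the variance of the sample mean, (1 − n/N)·s²/n, and takes the bound as two standard errors; the function name bound_on_mean is illustrative only.

```python
import math

def bound_on_mean(s2, n, N):
    # Assumed estimator: V(y_bar) = (1 - n/N) * s2 / n; bound = 2 standard errors
    return 2 * math.sqrt((1 - n / N) * s2 / n)

y_bar = 94.22                                 # sample mean (RM), from the example
B = bound_on_mean(s2=445.21, n=200, N=1000)   # roughly RM2.67
print(f"mu is estimated as RM{y_bar:.2f} with bound RM{B:.2f}")
```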
Example 4.3: A simple random sample of n = 9 hospital records is drawn to
estimate the average amount of money due on N = 484 open accounts. The sample
values for these 9 records are listed in the following table. Estimate 𝜇, the average
amount outstanding, and place a bound on the error of estimation.
Table 4.3: Amount of money owed (RM)
y_i   y1     y2     y3     y4     y5     y6     y7     y8     y9
RM    33.50  32.00  52.00  43.00  40.00  41.00  45.00  42.50  39.00
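The estimate and its bound can be computed directly from the nine values in Table 4.3. This is a sketch under the same assumptions as before: the variance of the sample mean is estimated by (1 − n/N)·s²/n and the bound is two standard errors.

```python
import math
import statistics

amounts = [33.50, 32.00, 52.00, 43.00, 40.00, 41.00, 45.00, 42.50, 39.00]  # Table 4.3
N = 484              # open accounts in the population
n = len(amounts)     # 9 sampled records

y_bar = statistics.mean(amounts)     # estimate of mu
s2 = statistics.variance(amounts)    # sample variance, divisor n - 1
bound = 2 * math.sqrt((1 - n / N) * s2 / n)

print(f"y_bar = {y_bar:.2f}, s2 = {s2:.2f}, bound = {bound:.2f}")
```

With so few records the bound is wide relative to the mean, which is worth noting when interpreting the estimate.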
Example 4.4: An industrial firm is concerned about the time spent each week by
scientists on certain trivial tasks. The time-log sheets of a simple random sample
of n = 50 employees show that the average amount of time spent on these tasks is
10.31 hours, with a sample variance of 2.25. The company employs N = 750 scientists.
Estimate the total number of worker-hours lost each week on trivial tasks and place
a bound on the error of estimation.
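For a population total, the sample mean is scaled up by N. The sketch below assumes the estimator τ̂ = N·ȳ with estimated variance N²(1 − n/N)·s²/n and a bound of two standard errors.

```python
import math

N, n = 750, 50            # scientists employed, scientists sampled
y_bar, s2 = 10.31, 2.25   # sample mean (hours) and sample variance

tau_hat = N * y_bar                                   # estimated total hours per week
bound = 2 * math.sqrt(N**2 * (1 - n / N) * s2 / n)    # bound on the error of estimation

print(f"tau_hat = {tau_hat:.1f} hours, bound = {bound:.1f} hours")
```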
4.4 Selecting the Sample Size for Estimating Population Means and Total
Example 4.5: The average amount of money 𝜇 for a hospital’s accounts receivable
must be estimated. Although no prior data are available to estimate the population
variance, it is known that most accounts lie within a RM100.00 range. There are N
= 1000 open accounts. Find the sample size needed to estimate 𝜇 with a bound on
the error of estimation B = RM3.00.
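One way to approach this example is sketched below. It assumes the common rule of thumb that the range covers about four standard deviations (σ ≈ range/4 = 25), which the example itself does not state, and the sample size formula n = Nσ²/((N − 1)D + σ²) with D = B²/4. The function name n_for_mean is hypothetical.

```python
import math

def n_for_mean(N, sigma2, B):
    # Assumed formula: n = N*sigma2 / ((N - 1)*D + sigma2), with D = B^2 / 4
    D = B**2 / 4
    return N * sigma2 / ((N - 1) * D + sigma2)

sigma = 100.00 / 4    # assumption: range is roughly 4 standard deviations
n_raw = n_for_mean(N=1000, sigma2=sigma**2, B=3.00)
n = math.ceil(n_raw)  # round up so the bound is still met

print(f"n_raw = {n_raw:.2f}, sample size n = {n}")
```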
Example 4.6: An investigator is interested in estimating the total weight gain in 4
weeks for N = 1000 chicks fed on a new ration. Obviously, to weigh each bird
would be time-consuming and tedious. Therefore, determine the number of chicks
to be sampled in this study in order to estimate 𝜏 with a bound on the error of
estimation B = 1000 grams. Many similar studies on chick nutrition have been run
in the past. Using data from these studies, the investigator found that 𝜎², the
population variance, was approximately equal to 36 grams². Determine the required
sample size.
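When the target is a total rather than a mean, the same sample size formula can be used with D = B²/(4N²). The following sketch works the chick example under that assumption.

```python
import math

N = 1000         # chicks in the study population
sigma2 = 36.0    # grams^2, population variance suggested by earlier studies
B = 1000.0       # grams, bound on the error of estimating the total

D = B**2 / (4 * N**2)                          # assumed form of D for a total
n_raw = N * sigma2 / ((N - 1) * D + sigma2)
n = math.ceil(n_raw)                           # round up to a whole chick

print(f"n_raw = {n_raw:.2f}, sample size n = {n}")
```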
4.5 Estimation of a Population Proportion
Example: A simple random sample of n = 100 college seniors was selected to
estimate
1. The fraction of N = 300 seniors going to graduate school, and
2. The fraction of students that have held part-time jobs during college.
Let 𝑦𝑖 and 𝑥𝑖 (i = 1, 2, …, 100) denote the responses of the ith student sampled.
\[ y_i = \begin{cases} 0, & \text{if the $i$th student does not plan to attend graduate school} \\ 1, & \text{if the $i$th student plans to attend graduate school} \end{cases} \]
Similarly, let
\[ x_i = \begin{cases} 0, & \text{if the $i$th student has not held a part-time job during college} \\ 1, & \text{if the $i$th student has held a part-time job during college} \end{cases} \]
Using the following data, estimate
a) the proportion of seniors planning to attend graduate school, and
b) the proportion of seniors who have held a part-time job at some time during
college.
c) Find a 95% confidence interval for the true proportion of seniors planning to
attend graduate school, and for the true proportion of seniors who have held a
part-time job at some time during college.

Student   y   x
1         1   0
2         0   1
3         0   1
.
.
.
98        0   1
99        0   1
100       1   1

\[ \sum_{i=1}^{100} y_i = 15, \qquad \sum_{i=1}^{100} x_i = 65 \]
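The three parts of this example can be worked from the two sums alone. The sketch below assumes the estimator p̂ = Σyᵢ/n with estimated variance (1 − n/N)·p̂q̂/(n − 1), and takes an approximate 95% interval as p̂ plus or minus two standard errors; the function name proportion_estimate is illustrative.

```python
import math

def proportion_estimate(successes, n, N):
    """Return (p_hat, bound) for a proportion under simple random sampling.
    Assumed variance estimator: (1 - n/N) * p_hat * q_hat / (n - 1)."""
    p_hat = successes / n
    var_hat = (1 - n / N) * p_hat * (1 - p_hat) / (n - 1)
    return p_hat, 2 * math.sqrt(var_hat)

p_y, b_y = proportion_estimate(15, n=100, N=300)   # graduate school plans
p_x, b_x = proportion_estimate(65, n=100, N=300)   # held a part-time job

ci_y = (p_y - b_y, p_y + b_y)   # approximate 95% interval
ci_x = (p_x - b_x, p_x + b_x)

print(f"p_y = {p_y:.2f}, interval = ({ci_y[0]:.3f}, {ci_y[1]:.3f})")
print(f"p_x = {p_x:.2f}, interval = ({ci_x[0]:.3f}, {ci_x[1]:.3f})")
```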
Sample size required to estimate p with a bound on the error of estimation B:
\[ n = \frac{Npq}{(N - 1)D + pq} \]
where \( D = \dfrac{B^2}{4} \) and \( q = 1 - p \).

Example: Student government leaders at a college want to conduct a survey to
determine the proportion of students who favor a proposed honor code. Because
interviewing N = 2000 students in a reasonable length of time is almost impossible,
determine the sample size (number of students to be interviewed) needed to
estimate p with a bound on the error of estimation of magnitude B = 0.05. Assume
that no prior information is available to estimate p.
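With no prior information, a conservative choice is p = 0.5, since pq is largest there and so the required n is largest. The sketch below applies n = Npq/((N − 1)D + pq) with D = B²/4 under that assumption; the function name n_for_proportion is hypothetical.

```python
import math

def n_for_proportion(N, p, B):
    # n = N*p*q / ((N - 1)*D + p*q), with D = B^2 / 4 and q = 1 - p
    q = 1 - p
    D = B**2 / 4
    return N * p * q / ((N - 1) * D + p * q)

# p = 0.5 is an assumption made because no prior estimate of p is available
n_raw = n_for_proportion(N=2000, p=0.5, B=0.05)
n = math.ceil(n_raw)   # round up to a whole number of students

print(f"n_raw = {n_raw:.2f}, sample size n = {n}")
```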
Example: Refer to the previous example. Suppose that, in addition to estimating the
proportion of students who favor the proposed honor code, student government
leaders also want to estimate the number of students who feel the student union
building adequately serves their needs. Determine the combined sample size
required for a survey to estimate
a) the proportion that favors the proposed honor code, p1, with B = 0.05.
b) the proportion that believes the student union adequately serves their needs, p2,
with B = 0.07.
Although no prior information is available to estimate p1, approximately 60% of the
students believed the union adequately met their needs in a similar survey run the
previous year.
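When one survey must meet two bounds, a reasonable approach is to compute the sample size for each requirement separately and take the larger. The sketch below assumes p1 = 0.5 (no prior information) and p2 = 0.6 (last year's survey), with n = Npq/((N − 1)D + pq) and D = B²/4.

```python
import math

def required_n(N, p, B):
    # Sample size for estimating a proportion p with bound B (D = B^2 / 4)
    D = B**2 / 4
    pq = p * (1 - p)
    return N * pq / ((N - 1) * D + pq)

N = 2000
n_code = math.ceil(required_n(N, p=0.5, B=0.05))    # honor code; p = 0.5 is assumed
n_union = math.ceil(required_n(N, p=0.6, B=0.07))   # student union; p from last year

n_combined = max(n_code, n_union)   # the larger n satisfies both bounds
print(n_code, n_union, n_combined)
```

Taking the maximum means the less demanding requirement is met with room to spare.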
4.6 Comparing Estimates
Example: Fish absorb mercury as water passes through their gills, and too much
mercury makes the fish unfit for human consumption. In 1994, the state of Maine
issued a health advisory warning that people should be careful about eating fish
from Maine lakes because of the high levels of mercury. Before the warning, fish
were collected from a random sample of Maine lakes and their mercury
content was measured in parts per million (ppm). The following table shows a
selection of data from a random sample of 35 lakes.
Mercury (Hg), ppm   Lake type   Dam (1 = yes; 0 = no)
1.050               2           1
0.230               2           1
.
.
0.160               1           0
0.490               3           0

Type 1 lakes are oligotrophic (balanced between decaying vegetation and living
organisms), Type 2 lakes are eutrophic (high decay rate and little oxygen), and Type
3 lakes are mesotrophic (between the other two states). The table also shows
whether the lake is formed behind a dam. The summary statistics are in the
following table:
Type   Count   Mean   Median   Standard deviation
1      4       0.22   0.20     0.103
2      15      0.74   0.68     0.583
3      16      0.50   0.44     0.272

a) Comparing lake Types 1 and 2, what is your best estimate of the difference in
the mean mercury levels for these two types of lakes?
b) Is there sufficient evidence to conclude that the mean mercury level for lakes
of Type 2 differs from that for lakes of Type 3?
Solution:
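One possible working, sketched in Python from the summary table, is given here. It treats the lake types as independent samples, estimates the variance of a difference in means by s₁²/n₁ + s₂²/n₂, and ignores the finite population correction because the total number of lakes is not given; all three are assumptions of this sketch.

```python
import math

# (count, mean, standard deviation) per lake type, from the summary table
lakes = {1: (4, 0.22, 0.103), 2: (15, 0.74, 0.583), 3: (16, 0.50, 0.272)}

def compare(a, b):
    """Estimated difference mean_a - mean_b with a 2-standard-error bound,
    treating the groups as independent samples (no fpc: N is not given)."""
    n_a, m_a, s_a = lakes[a]
    n_b, m_b, s_b = lakes[b]
    diff = m_a - m_b
    bound = 2 * math.sqrt(s_a**2 / n_a + s_b**2 / n_b)
    return diff, bound

d21, b21 = compare(2, 1)   # part a): Type 2 minus Type 1
d23, b23 = compare(2, 3)   # part b): Type 2 minus Type 3

# If the interval for part b) contains 0, the data do not show a clear difference
covers_zero = d23 - b23 <= 0 <= d23 + b23
print(f"a) {d21:.2f} +/- {b21:.3f}")
print(f"b) {d23:.2f} +/- {b23:.3f}, interval contains 0: {covers_zero}")
```

Under these assumptions the interval in part b) includes zero, so the sample does not give sufficient evidence of a difference between Types 2 and 3. The Type 1 group has only four lakes, so its comparison should be read with caution.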
Notes: When comparing means, we consider only the independent sample case
because the dependent case becomes too complicated to handle at this level.
For the two sample proportions arising from a multinomial sample of size n,
\[ E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2 \]
and
\[ V(\hat{p}_1 - \hat{p}_2) = V(\hat{p}_1) + V(\hat{p}_2) - 2\,\mathrm{Cov}(\hat{p}_1, \hat{p}_2) \]
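The variance expression can be written out explicitly. The step below uses the standard multinomial result that two category proportions from the same sample of size n have covariance −p₁p₂/n, and writes qᵢ = 1 − pᵢ. It ignores any finite population correction, which is an assumption of this sketch.

```latex
\[
\mathrm{Cov}(\hat{p}_1, \hat{p}_2) = -\frac{p_1 p_2}{n}
\quad\Longrightarrow\quad
V(\hat{p}_1 - \hat{p}_2) = \frac{p_1 q_1}{n} + \frac{p_2 q_2}{n} + \frac{2 p_1 p_2}{n}
\]
```

Because the covariance is negative, the variance of the difference is larger than it would be for two independent samples of the same size.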

Example: The notion of banning smoking from the workplace has been around
for a long time. A Time poll of 800 adults carried out in April 1994 asked:
‘Should smoking be banned from workplaces, should there be special smoking
areas, or should there be no restrictions?’
The results are as follows:
                  Non-smokers (%)   Smokers (%)
Banned            44                8
Special areas     52                80
No restrictions   3                 11

Based on a sample of approximately 600 non-smokers and 200 smokers, estimate
a) the true difference between the proportions choosing ‘banned’.
b) the true difference between the proportions of non-smokers choosing
‘banned’ and ‘special areas’.
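The two parts call for different variance formulas, and the sketch below illustrates the contrast. Part a) compares two separate groups, so the samples are treated as independent. Part b) compares two answers within the same group of non-smokers, so the multinomial covariance −p₁p₂/n is included. Population sizes are unknown, so no finite population correction is applied; that is an assumption of this sketch.

```python
import math

n_non, n_smk = 600, 200   # approximate group sizes from the example

# a) 'banned': non-smokers vs smokers (independent samples)
p_non, p_smk = 0.44, 0.08
diff_a = p_non - p_smk
bound_a = 2 * math.sqrt(p_non * (1 - p_non) / n_non + p_smk * (1 - p_smk) / n_smk)

# b) non-smokers only: 'banned' vs 'special areas' (same multinomial sample)
p_ban, p_area = 0.44, 0.52
diff_b = p_ban - p_area
# V = p1*q1/n + p2*q2/n + 2*p1*p2/n, because Cov(p1_hat, p2_hat) = -p1*p2/n
var_b = (p_ban * (1 - p_ban) + p_area * (1 - p_area) + 2 * p_ban * p_area) / n_non
bound_b = 2 * math.sqrt(var_b)

print(f"a) {diff_a:.2f} +/- {bound_a:.3f}")
print(f"b) {diff_b:.2f} +/- {bound_b:.3f}")
```

In part b) the bound is about as large as the estimated difference itself, so the data give little evidence that non-smokers prefer one of the two options over the other.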
