
When we have a specific objective and want to collect data and carry out an analysis
that will help us meet that objective, we typically get our data from two common sources:
observational studies (such as polls) and experiments (such as using a treatment to
improve hair growth).
In an observational study, we observe and measure specific characteristics, but we don't
attempt to modify the subjects being studied.
In an experiment, we apply some treatment and then proceed to observe its effect on the
subjects.

SAMPLE SIZE
An important consideration in conducting research is the size of your sample. It must be
large enough so that erratic behavior of very small samples will not produce misleading
results. Repetition of a study or an experiment is called replication.
A large sample is not necessarily a good sample. Although it is important to have a sample
that is sufficiently large, it is more important to have a sample in which the elements have
been chosen in an appropriate way, such as random selection.
Use a sample size large enough so that we can see the true nature of any effects or
phenomena, and obtain the sample using an appropriate method, such as one based
on randomness.
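
The erratic behavior of very small samples can be illustrated with a small simulation. This is only a sketch: the population below is hypothetical (randomly generated), and the helper name `spread_of_sample_means` is made up for this illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 measurements (an assumption for illustration).
population = [rng.gauss(50, 10) for _ in range(10_000)]


def spread_of_sample_means(n, repetitions=200):
    """Standard deviation of the sample mean over repeated random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repetitions)]
    return statistics.stdev(means)


# Small samples give sample means that jump around much more than large samples do.
for n in (5, 50, 500):
    print(n, round(spread_of_sample_means(n), 2))
```

The printed spread should shrink as n grows, which is one way to see why a sample must be sufficiently large. Note that every sample here is drawn at random; a large sample chosen inappropriately would not benefit in the same way.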

Determining Sample Size (n)

Slovin's Formula

$$ n = \frac{N}{1 + N e^{2}} $$

n = minimum sample size
N = population size
e = margin of error due to sampling (0.05 or 0.025 or 0.10)

Lynch et al. Formula

$$ n = \frac{N Z^{2} \, p(1 - p)}{N d^{2} + Z^{2} \, p(1 - p)} $$

n = sample size
N = population size
p = 0.50 (proportion of getting a good sample)
1 - p = 0.50 (proportion of getting a poor sample)
d = 0.025 or 0.05 or 0.10 (your choice of sampling error)
Z = 1.96 (95% reliability in obtaining the sample size)
    2.576 (99% reliability in obtaining the sample size)
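
The two formulas can be sketched in code as follows. The function names `slovin` and `lynch` and the population of 1,000 are assumptions made for this illustration, and the results are rounded up so that the sample is never smaller than the formula requires.

```python
from math import ceil


def slovin(N, e):
    """Slovin's formula: n = N / (1 + N * e^2), rounded up."""
    return ceil(N / (1 + N * e ** 2))


def lynch(N, d, Z=1.96, p=0.50):
    """Lynch et al. formula: n = N Z^2 p(1-p) / (N d^2 + Z^2 p(1-p)), rounded up."""
    variability = Z ** 2 * p * (1 - p)
    return ceil(N * variability / (N * d ** 2 + variability))


# Hypothetical population of 1,000 with a 0.05 sampling error.
print(slovin(1000, 0.05))  # 286
print(lynch(1000, 0.05))   # 278
```

With the same error value, the two formulas give similar but not identical sample sizes, because the Lynch et al. formula also takes the reliability level Z and the proportion p into account.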

Sample Size for Estimating Mean

Population (N) is not known:

$$ n = \left[ \frac{Z_{\alpha/2} \, \sigma}{E} \right]^{2} $$

Population (N) is known:

$$ n = \frac{N \sigma^{2} Z_{\alpha/2}^{2}}{(N - 1) E^{2} + \sigma^{2} Z_{\alpha/2}^{2}} $$

$Z$ or $Z_{\alpha/2}$ = 1.96 (95% degree of confidence)
                        2.576 (99% degree of confidence)
E = desired margin of error
$\sigma$ = population standard deviation
(In case the population standard deviation is not known, use the sample standard deviation
s instead.)
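
As a brief sketch of where the formula for an unknown population size comes from, assume the usual normal-based margin of error for a sample mean (this starting point is an assumption; the notes do not state it). Solving that expression for n gives the sample size:

```latex
\begin{aligned}
E &= Z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{Z_{\alpha/2} \, \sigma}{E} \\
n &= \left[ \frac{Z_{\alpha/2} \, \sigma}{E} \right]^{2}
\end{aligned}
```

Because n must be a whole number of subjects, any fractional result is rounded up to the next integer so that the margin of error is not exceeded.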

Problem 1:
To plan for the proper handling of household garbage, the city of Baguio must estimate the
mean weight of garbage discarded by households in one week. Find the sample size
necessary to estimate that mean if you want to be 95% confident that the sample mean is
within 2 lb of the true population mean. For the population standard deviation σ, use the value
12.46 lb, which is the standard deviation of the sample of 62 households included in the
Garbage Project study conducted at the University of Baguio.

Problem 2:
A researcher wants to estimate the mean amount of time (in hours) that full-time college
students spend watching television each weekday. Find the sample size necessary to
estimate that mean with a 0.25 hr (15 minutes) margin of error. Assume that a 99% degree of
confidence is desired. Also assume that a pilot study showed that the standard deviation is
estimated to be 1.87 hr.
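
One possible way to work both problems is sketched below, treating the population size as unknown. The function name `sample_size_for_mean` is made up for this illustration, and the critical value is computed from the standard normal distribution instead of being read from a table, so a hand calculation with a rounded Z may differ slightly.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(confidence, sigma, margin_of_error):
    """Minimum n so that the sample mean is within margin_of_error of the
    population mean at the given confidence level (population size unknown)."""
    # Two-sided critical value z_(alpha/2), about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / E)^2, rounded up to the next whole subject.
    return ceil((z * sigma / margin_of_error) ** 2)


# Problem 1: 95% confidence, sigma = 12.46 lb, E = 2 lb.
print(sample_size_for_mean(0.95, sigma=12.46, margin_of_error=2))    # 150
# Problem 2: 99% confidence, sigma = 1.87 hr, E = 0.25 hr.
print(sample_size_for_mean(0.99, sigma=1.87, margin_of_error=0.25))  # 372
```

Rounding up rather than to the nearest integer is deliberate: rounding down would give a margin of error slightly larger than the one requested.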

RANDOMIZATION
One of the worst mistakes is to collect data in a way that is inappropriate. We cannot
overstress this very important point:
Data carelessly collected may be so completely useless that no amount of statistical
torturing can salvage them.

COMMON METHODS OF SAMPLING


In a random sample, members of the population are selected in such a way that each has an
equal chance of being selected.
(1) A simple random sample of size n subjects is selected in such a way that every possible
sample of size n has the same chance of being chosen or selected.
Techniques: fishbowl (lottery) technique; table of random numbers or
computer-generated random numbers.
(2) In systematic sampling, we select some starting point and then select every kth (such
as every 50th) element in the population.
Example: If Coca-Cola managers wanted to poll the 29,500 employees, they could begin with
a complete employee roster, then select every 50th person to obtain a sample of size 590.
This method is simple and is often used.

(3) With convenience sampling, we simply use results that are readily available.
In some cases, results from convenience sampling may be quite good, but in
many other cases they may be seriously biased.
Example: In investigating the proportion of left-handed students, it would be convenient
for a student to survey his or her classmates, because they are readily available.
Even though such a sample is not random, the results should be quite good.
(4) With stratified sampling, we subdivide the population into at least two different
subgroups (or strata) that share the same characteristics (such as gender or age
bracket), then we draw a sample from each stratum.
Example: Using the provinces of the CAR region as strata, we might select a random sample of voters in
each province. If the different strata have sample sizes that are in the same
proportion as in the population, we say that we have proportionate sampling. If it
should happen that some strata are not represented in the proper proportion, then the
results can be adjusted or weighted accordingly. For a fixed sample size, if you
randomly select subjects from different strata, you are likely to get more consistent
(and less variable) results than by simply selecting a random sample from the general
population. For that reason, stratified sampling is often used to reduce the variation in
the results.

(5) In cluster sampling, we first divide the population area into sections (or clusters), then
randomly select some of those clusters, and then choose all the members from those selected
clusters.
Note: Stratified sampling and cluster sampling both involve the formation of subgroups, but
cluster sampling uses all members from a sample of clusters, whereas stratified sampling uses
a sample of members from all strata.
Example: Cluster sampling can be found in a pre-election poll, in which we randomly select 30
election precincts and survey all the people from each of those precincts. This would be much
faster and much less expensive than selecting one person from each of the many precincts in
the population area. The results can be adjusted or weighted to correct for any disproportionate
representations of groups. Cluster sampling is used extensively by government and private
research organizations.
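
The random-based methods above can be sketched in a few lines of code. Everything below is illustrative: the roster of ID numbers, the strata "A" and "B", the precinct names, and the function names are all hypothetical, and convenience sampling is omitted because it involves no selection procedure.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable


def simple_random_sample(population, n):
    """Every possible sample of size n has the same chance of being chosen."""
    return rng.sample(population, n)


def systematic_sample(population, k):
    """Pick a random starting point among the first k elements, then every kth element."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, fraction):
    """Draw the same fraction at random from every stratum (proportionate sampling)."""
    return {name: rng.sample(members, round(len(members) * fraction))
            for name, members in strata.items()}


def cluster_sample(clusters, how_many):
    """Randomly choose whole clusters, then keep all members of the chosen clusters."""
    chosen = rng.sample(sorted(clusters), how_many)
    return [member for name in chosen for member in clusters[name]]


# Hypothetical roster of 29,500 ID numbers.
roster = list(range(29_500))
print(len(simple_random_sample(roster, 590)))  # 590
print(len(systematic_sample(roster, 50)))      # 590

# Two hypothetical strata, sampled at 2% each.
strata = {"A": roster[:20_000], "B": roster[20_000:]}
print({name: len(s) for name, s in stratified_sample(strata, 0.02).items()})
# {'A': 400, 'B': 190}

# 295 hypothetical precincts of 100 members each; survey everyone in 30 of them.
precincts = {f"precinct_{i}": roster[i * 100:(i + 1) * 100] for i in range(295)}
print(len(cluster_sample(precincts, 30)))      # 3000
```

The stratified function samples some members from every stratum, while the cluster function takes every member from only some clusters, which mirrors the distinction drawn in the note above.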

A sampling error is the difference between a sample result and the true population result;
such an error results from chance sample fluctuations.
A nonsampling error occurs when the sample data are incorrectly collected, recorded, or
analyzed (such as by selecting a biased sample, using a defective measurement instrument, or
copying the data incorrectly).
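
The difference between the two kinds of error can be seen in a small simulation. This is a sketch under stated assumptions: the population is randomly generated, and the "defective instrument" is a hypothetical one that reads 2 units too high.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of 50,000 measurements (an assumption for illustration).
population = [rng.gauss(27, 12) for _ in range(50_000)]
true_mean = statistics.mean(population)

sample = rng.sample(population, 400)

# Sampling error: the sample mean differs from the true mean by chance alone.
sampling_error = statistics.mean(sample) - true_mean

# Nonsampling error: a hypothetical defective instrument reads 2 units too high,
# so every recorded value is shifted no matter how the sample was drawn.
recorded = [x + 2 for x in sample]
total_error = statistics.mean(recorded) - true_mean

print(round(sampling_error, 2), round(total_error, 2))
```

Drawing a larger random sample would tend to shrink the first number, but the 2-unit shift caused by the instrument would remain, which is why careful data collection matters as much as sample size.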

Random Sampling: Each member of the population has an equal chance of being
selected. Computers are often used to generate random numbers.

Systematic Sampling: Select every kth member.

Convenience Sampling: Use results that are readily available.

Stratified Sampling: Classify the population into at least two strata, then draw a
sample from each.

Cluster Sampling: Divide the population area into sections, randomly select a few of
those sections, and then choose all members in them.