SAMPLING STRATEGIES

1
SAMPLING STRATEGIES
 Sampling in business research is generally conducted
in order to permit the detailed study of part, rather
than the whole, of a population.

 The information derived from the resulting sample is
customarily employed to develop useful
generalizations about the population.

 These generalizations may be in the form of estimates
of one or more characteristics associated with the
population, or they may be concerned with estimates
of the strength of relationships between
characteristics within the population.
2
Why select a sample of the target
population?
It is usually not logistically possible to survey every
individual within your target population.

Therefore, a sample of this population is
selected for the survey. The results from this
sample are then generalized ("applied") to the target
population.

3
What is the target population?
The target population, defined by each
site, is the population to which the results
of the survey should be applicable.
Example:
Child Labour Survey (CLS)
The primary objective is to measure the
prevalence of child labour. The target
population of a survey with this type of
objective is the total population of children
exposed to the risk of child labour.
4
What is a sample?
 A sample is a carefully selected subset of
the target population.

Example:
Child Labour Survey (CLS)
Suppose the total population of children exposed to
the risk of child labour is 10,000. A sample
of 2,000 children will be used for the
survey to draw conclusions about the
population.
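A minimal sketch of drawing such a sample, assuming the 10,000 children can be listed in a sampling frame. The ID numbers and the fixed seed are hypothetical and serve only to make the illustration repeatable.

```python
import random

# Hypothetical sampling frame: one ID per child in the target population.
population_ids = list(range(1, 10_001))

rng = random.Random(42)  # fixed seed, for a repeatable illustration only

# Simple random sample without replacement: every child has the same
# chance (2,000 / 10,000 = 0.2) of being selected.
sample_ids = rng.sample(population_ids, k=2_000)

sampling_fraction = len(sample_ids) / len(population_ids)
print(len(sample_ids), sampling_fraction)
```

In a real survey the frame would come from a census or register rather than a range of integers.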
5
Requirements of a Good Sample

 The dual characteristics of a good sample are:

◦ (i) randomness

◦ (ii) representativeness of the target population

6
CONCEPTS OF STATISTICAL
INFERENCE
 The world is full of uncertainty, also known as
randomness, which appears almost everywhere: in
weather forecasts, air pollution, stock market
movements, economic indices, crop yields, product
quality, supply and demand, traffic flow, the effects of
medical treatments, and so on.

 The idea of statistical inference is to draw sensible
and accurate conclusions in the presence of uncertainty by
analyzing data collected from the real world.

7
Approaches to quantitative sampling...

1. Random (“probability”): allows a procedure
governed by chance to select the
sample; controls for sampling bias
2. Nonrandom (“non-probability”): does
not use random selection at any
stage of the sample selection;
increases the risk of sampling bias

8
Classification of Sampling Methods

Sampling Methods

 Probability samples
◦ Simple random
◦ Systematic
◦ Stratified
◦ Cluster

 Non-probability samples
◦ Convenience
◦ Judgment
◦ Quota
◦ Snowball
9
Random (Probability) Sampling
Methods
 Sampling methods that allow us to know in
advance how likely it is that any element of a
population will be selected for the sample, and in
which EVERY ELEMENT HAS SOME KNOWN
CHANCE OF BEING SELECTED, are
termed probability sampling methods.
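As a rough sketch of two of the probability designs named in the classification (systematic and proportional stratified sampling), the code below uses a hypothetical frame of 1,000 units with made-up "urban"/"rural" strata. In both designs every unit has a known 1-in-10 chance of selection.

```python
import random

rng = random.Random(7)  # fixed seed, for a repeatable illustration only

# Hypothetical frame of 1,000 units, each tagged with a made-up stratum.
frame = [{"id": i, "stratum": "urban" if i % 4 else "rural"}
         for i in range(1000)]

def systematic_sample(units, n):
    """Pick a random start, then every k-th unit (k = N // n)."""
    k = len(units) // n
    start = rng.randrange(k)
    return units[start::k][:n]

def stratified_sample(units, fraction):
    """Draw a simple random sample of the same fraction in each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit["stratum"], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

sys_sample = systematic_sample(frame, 100)
strat_sample = stratified_sample(frame, 0.10)
print(len(sys_sample), len(strat_sample))
```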

10
Non-Probability Sampling
 Samples wherein the odds (chances) of
each element being selected are not
known are called NON-PROBABILITY
SAMPLES.

11
Probability Sampling

12
Convenience sampling
 The researcher has the freedom to choose whoever is
available for inclusion in the sample. Though not a reliable
research design, a convenience sample is often the easiest
to assemble and carry out.
◦ Examples include the selection of friends and neighbours who are
easy to locate and convenient to poll.

 While the findings emerging from convenience samples may
be imprecise and unreliable, they may be useful, for
example, in the exploratory stage of studies with multiple
data gathering procedures and combined research
strategies.

 Generally, however, this strategy has little else to
commend it as a sampling strategy, because it lacks
methodological rigour and validity.
13
Judgement sampling
 Judgement sampling is a form of purposive
sampling. Purposive sampling, in turn, is a
generic term that is used to describe any
sample which is deliberately chosen by the
researcher in accordance with predetermined
non-probability criteria.

◦ For example, in a study of computer hardware
trends, the researcher may want to interview only
those with fairly wide experience in the field.
14
Quota sampling
 The study may require a sample stratified on
certain variables but for some valid reason
proportional stratified sampling is not possible. In
this situation, quota sampling may be an
appropriate choice to improve the
representativeness of the study.

 Here the population is divided proportionately
into predetermined categories (for example, on
the basis of population demography) and the
subjects in each category are deliberately selected
from the population until a particular quota has
been met for each category.
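The quota-filling procedure can be sketched as follows. The category names, quota sizes, and the stream of respondents are all hypothetical. Respondents are taken in the order they are met, not at random, which is what makes this a non-probability design.

```python
from itertools import cycle, islice

# Hypothetical quotas per category (for example, set from demography).
quotas = {"under_30": 3, "30_to_49": 4, "50_plus": 3}

# Hypothetical respondents in the order the interviewer meets them.
groups = cycle(["under_30", "under_30", "30_to_49", "50_plus"])
respondents = [{"id": i, "group": g} for i, g in enumerate(islice(groups, 40))]

selected = {group: [] for group in quotas}
for person in respondents:
    bucket = selected[person["group"]]
    # Accept the person only while the quota for their category is open.
    if len(bucket) < quotas[person["group"]]:
        bucket.append(person["id"])
    # Stop once every quota has been met.
    if all(len(selected[g]) == quotas[g] for g in quotas):
        break

print({g: len(ids) for g, ids in selected.items()})
```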
15
SAMPLING DISTRIBUTIONS
AND
ESTIMATION

16
ESTIMATION
 How can we obtain sensible information (inference) about
the population from the data?
◦ A simple but widely applied model is to assume that the
data x1, . . . , xn are the observed values of independent random
variables X1, . . . , Xn following a common distribution.
◦ A probability distribution is then assumed as the common
distribution of X1, . . . , Xn.
◦ In this case we say that X1, . . . , Xn are independently and
identically distributed (i.i.d.), and call X1, . . . , Xn, or their
observed values x1, . . . , xn, a random sample of size n.
◦ This i.i.d. (probability) model is appropriate in situations
where the data are collected randomly in the sense that each
individual in the population has an equal chance of being selected
for observation.
17
ESTIMATION….cont.
 A main purpose of statistics is to get information on
the population by observing and analyzing a small
portion of the population (sample).

◦ For example, to learn the voting intentions of a
population (usually in the millions or more), political analysts
conduct opinion polls on random samples of voters (with
sizes typically in the hundreds or a few thousand). If the
polls and analyses are conducted properly, the predictions
of voting results are often quite accurate.

18
Population Mean
 Population mean = (sum of all the values in the population) / (number of values in the population)

 The mean of the population using mathematical
symbols is:
$\mu = \frac{\sum X}{N}$
◦ where:
 µ represents the population mean.
 N is the number of values in the population.
 X represents any particular value.
 ΣX is the sum of the X values in the population.

19
Population Variance
 The formulas for the population variance and the
sample variance are slightly different.
 The population variance is found by:

$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{\sum X^2}{N} - \mu^2$
◦ where:
 σ² is the population variance
 X is the value of an observation in the population.
 µ is the arithmetic mean of the population.
 N is the number of observations in the population

 Note:
◦ For populations whose values are near the mean, the
variance will be small. For populations whose values are
dispersed from the mean, the population variance will be
large.
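A small numerical check of the population mean and of the two equivalent forms of the population variance. The five values are made up for illustration.

```python
# Hypothetical population of N = 5 values (made up for illustration).
population = [2, 4, 4, 6, 9]
N = len(population)

mu = sum(population) / N  # population mean

# Definition: average squared deviation from the mean.
var_definition = sum((x - mu) ** 2 for x in population) / N

# Shortcut: mean of the squares minus the square of the mean.
var_shortcut = sum(x ** 2 for x in population) / N - mu ** 2

print(mu, var_definition, var_shortcut)
```

Both expressions agree up to floating-point rounding, as the formula states.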
20
Sample Mean & Variance
 A sample should be representative of the population.

 Sample mean: $\bar{x} = \frac{\sum x}{n}$

 Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
◦ s² is referred to as an unbiased estimator of the
population variance
 Where
◦ x = each score in the sample
◦ $\bar{x}$ = the mean of the sample
◦ n = number of observations in the sample
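A short sketch of the sample mean and sample variance, using a made-up sample of five scores. The comparison with Python's `statistics.variance` is included only to show that it uses the same n − 1 divisor.

```python
import statistics

# Hypothetical sample of n = 5 scores (made up for illustration).
sample = [12, 15, 11, 14, 18]
n = len(sample)

x_bar = sum(sample) / n
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)  # divisor n - 1

# The standard library uses the same n - 1 divisor for the sample variance.
matches_library = abs(s2 - statistics.variance(sample)) < 1e-9
print(x_bar, s2, matches_library)
```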
21
Sampling distribution of the mean
 The distribution of all possible sample means
and their related probability is called the
sampling distribution of the means.

22
Central limit theorem
 If a random sample of n observations is selected
from any population (with finite variance), then, when the sample size is
sufficiently large (as a rule of thumb, n≥30), the sampling distribution
of the mean tends to approximate the normal
distribution.

 The larger the sample size, n, the better will be the
normal approximation to the sampling distribution
of the mean.

 It can be shown that the mean of the sample means
is the same as the population mean, and that the standard
error of the mean, $\sigma/\sqrt{n}$, is smaller than the population
standard deviation whenever n > 1 (previous slide)
23
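A simulation sketch of the central limit theorem. The exponential population, the number of repetitions, and the seed are illustrative assumptions, not part of the slides. It compares the mean of the sample means with µ, and their standard deviation with σ/√n.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed, for a repeatable illustration only

# Hypothetical, clearly non-normal population: exponential with mean 1
# (its standard deviation is also 1).
population = [rng.expovariate(1.0) for _ in range(50_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 30
# Draw 4,000 samples of size n (with replacement) and keep each mean.
sample_means = [statistics.fmean(rng.choices(population, k=n))
                for _ in range(4_000)]

mean_of_means = statistics.fmean(sample_means)
standard_error = statistics.pstdev(sample_means)

print(round(mu, 3), round(mean_of_means, 3))
print(round(sigma / n ** 0.5, 3), round(standard_error, 3))
```

The two numbers on each printed line should be close to each other, and the standard error should be well below σ.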
Central limit theorem
 Advantage
◦ Sample data drawn from populations that are not normally
distributed, or from populations of unknown shape, can also
be analyzed by using the normal distribution,
because the sample means are approximately normally distributed
for sample sizes of n≥30.

 If sample means are normally distributed, the Z
score equation applied to sample means would
be:
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

24
25
Example 1
 You are the director of transportation safety for the
state of Georgia. You are concerned because the
average highway speed of all trucks may exceed the 60
mph speed limit. A random sample of 120 trucks shows
a mean speed of 62 mph.

◦ Assuming that the population mean is 60 mph and the population
standard deviation is 12.5 mph, find the probability that the
sample mean speed is greater than or equal to 62 mph.

26
Example 1- Solution

 $Z = \frac{62 - 60}{12.5/\sqrt{120}} \approx 1.75$
 P($\bar{X}$ ≥ 62) = P(Z ≥ 1.75) = 0.5 − 0.4599 = 0.0401,
or about 4%
 That is, about 4% of the time, a random sample of
120 trucks from the population will yield a
mean speed of 62 mph or more.
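The calculation can be checked with a short sketch that uses the standard normal distribution from Python's standard library instead of a printed table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 60.0, 12.5, 120  # values given in Example 1
x_bar = 62.0

standard_error = sigma / sqrt(n)     # sigma / sqrt(n)
z = (x_bar - mu) / standard_error    # about 1.75
p_upper = 1 - NormalDist().cdf(z)    # P(Z >= z), about 0.04

print(round(z, 2), round(p_upper, 3))
```

The exact tail probability differs slightly from the table-based figure because the table rounds Z to two decimal places.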
27
Estimates
 A point estimate is a single value (statistic) used to estimate
a population value (parameter).

 A point estimate does not tell us the precision of the estimate. To
have some idea of how precise an estimate is, we
introduce the concept of a confidence interval (abbreviated to
CI), which is referred to as interval estimation.

 An interval estimate states the range within which a
population parameter probably lies.

 A confidence interval is a range of values within which the
population parameter is expected to occur.

 The two confidence levels that are used most extensively are
95% and 99%.
28
Confidence Interval
 For a 95% confidence interval, about 95% of
the similarly constructed intervals will contain
the parameter being estimated.
 95% of the sample means for a specified sample
size will lie within 1.96 standard errors (standard deviations of
the sampling distribution) of the hypothesized population mean.

29
Confidence Interval
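 For the 99% confidence interval, 99% of the
sample means for a specified sample size will lie
within 2.58 standard errors of the
hypothesized population mean.

 In general, a confidence interval for the mean has the form
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$, where z depends on the
confidence level (1.96 for 95%, 2.58 for 99%).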

30
Margin of error & Interval estimate
 A point estimator cannot be expected to provide the
exact value of the population parameter.

 An interval estimate can be computed by adding and
subtracting a margin of error to and from the point estimate.

◦ The purpose of an interval estimate is to provide information
about how close the point estimate is to the value of the
parameter.

 The general form of an interval estimate of a
population mean is
$\bar{x} \pm \text{margin of error}$
31
Interval Estimation of a Population Mean (µ)
◦ When the population standard deviation σ is known,
irrespective of sample size:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

◦ If we do not know σ, we use s (the standard deviation of the
sample) as an estimate of σ (for large samples only, n≥30):
$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
 where:
 $\bar{x}$ is the sample mean
 $z_{\alpha/2}$ is the z value providing an area of α/2 in the upper tail of the
standard normal probability distribution
 σ is the population standard deviation
 s is the sample standard deviation
 n is the sample size
32
Interval Estimation of a Population Mean (µ)
 Small sample (n<30) and σ is not known:
$\bar{x} \pm t_{(n-1,\,\alpha/2)} \frac{s}{\sqrt{n}}$
 where:
◦ $t_{(n-1,\,\alpha/2)}$ = the t value providing an area of α/2 in the upper
tail of a t distribution with n - 1 degrees of freedom
◦ s = the sample standard deviation

33
Example 2
 (A) At a factory, batteries are produced with a
standard deviation of 2.4 months in battery life. In a sample of 64
batteries, the mean life expectancy is 12.35 months.
◦ Find a 95% confidence interval estimate for the life
expectancy of all batteries produced at the plant.

 (B) Discount Sounds has 260 retail outlets
throughout the United States. The firm is
evaluating a potential location for a new outlet,
based in part on the mean annual income of the
individuals in the marketing area of the new
location. A sample of size 36 was taken; the sample
mean income is $31,100. The population standard
deviation is estimated to be $4,500.
◦ Obtain a 95% confidence interval estimate of the population mean income.
34
Example 2 (A)- Solution
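One way to work part (A), sketched under the assumption that the stated 2.4 months is the known population standard deviation, so the z-based interval applies. The helper name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for x_bar ± z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Part (A): sigma = 2.4 months, n = 64, sample mean = 12.35 months.
lower, upper = z_interval(12.35, 2.4, 64)
print(round(lower, 2), round(upper, 2))
```

The margin of error is about 1.96 × 2.4/8 ≈ 0.59, giving an interval of roughly 11.76 to 12.94 months.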

35
Example 2 (B)- Solution
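A worked sketch for part (B), treating the estimated population standard deviation of $4,500 as σ and using z = 1.96 for 95% confidence:

```latex
\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
  = 31{,}100 \pm 1.96 \cdot \frac{4{,}500}{\sqrt{36}}
  = 31{,}100 \pm 1.96 \cdot 750
  = 31{,}100 \pm 1{,}470
```

Under these assumptions the 95% interval estimate of the mean annual income runs from $29,630 to $32,570.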

36
Example 3
 A reporter for a student newspaper is writing
an article on the cost of off-campus housing. A
sample of 16 efficiency apartments within a
half-mile of campus resulted in a sample mean
of $650 per month and a sample standard
deviation of $55.

◦ Calculate a 95% confidence interval estimate of the
mean rent per month for the population of
efficiency apartments within a half-mile of campus.

37
Example 3- Solution
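A worked sketch, assuming the small-sample t interval applies (n = 16, σ unknown) and taking t = 2.131 from a t table for 15 degrees of freedom and an upper-tail area of 0.025:

```latex
\bar{x} \pm t_{(n-1,\,\alpha/2)}\frac{s}{\sqrt{n}}
  = 650 \pm 2.131 \cdot \frac{55}{\sqrt{16}}
  = 650 \pm 2.131 \cdot 13.75
  \approx 650 \pm 29.30
```

Under these assumptions the 95% interval estimate of the mean monthly rent is approximately $620.70 to $679.30.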

38
Margin of Error & Sample Size
◦ Let E = the desired margin of error.
◦ E is the amount added to and subtracted from the
point estimate to obtain an interval estimate.

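 When the population standard deviation σ is known,
irrespective of sample size:
$E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, so the required sample size is $n = \frac{z_{\alpha/2}^2 \sigma^2}{E^2}$

 Small sample (n<30) and σ is not known:
$E = t_{(n-1,\,\alpha/2)} \frac{s}{\sqrt{n}}$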

39
Example 4
 A firm is evaluating a potential location for a
new outlet, based in part, on the mean annual
income of the individuals in the marketing area
of the new location. The population standard
deviation is estimated to be $4,500. Suppose
that the firm management team wants an
estimate of the population mean such that
there is a 0.95 probability that the sampling
error is $500 or less.

◦ How large a sample size is needed to meet the required
precision?
40
Example 4-Solution
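A minimal sketch of the sample-size calculation, assuming the margin-of-error relation E = z·σ/√n is solved for n and the result is rounded up to the next whole unit. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

# Example 4: sigma = $4,500, desired margin of error E = $500, 95% level.
n_required = sample_size_for_mean(4500, 500)
print(n_required)
```

With z ≈ 1.96, n = (1.96 × 4,500 / 500)² ≈ 311.2, so a sample of 312 is needed.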

41
Confidence Interval for Sample Proportion
 The following formula is used for the confidence
interval for a proportion, based on the sample proportion p:

$p \pm z_{\alpha/2} \sqrt{\frac{pq}{n}}$, where q = 1 − p

 Margin of error: $E = z_{\alpha/2} \sqrt{\frac{pq}{n}}$

42
Example 5
 Suppose that you are in charge of testing whether
dropping a computer will damage it. You want
to find the proportion of computers that break.
If you want a 90% confidence interval for
this proportion, with a margin of error of
4%,
◦ how many computers should you drop?

43
Example 5-Solution
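A sketch of one common approach. The example gives no prior estimate of the proportion, so p = 0.5 is assumed here as the most conservative choice (it maximizes pq). The margin-of-error formula is solved for n and rounded up. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p=0.5):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Example 5: 90% confidence, margin of error 4%, no prior estimate of p,
# so p = 0.5 is assumed as the most conservative choice.
n_required = sample_size_for_proportion(0.04, 0.90)
print(n_required)
```

With z ≈ 1.645, n = 1.645² × 0.25 / 0.04² ≈ 422.8, so about 423 computers would be needed under this assumption.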

44
