CHAPTER 7

The Logic of Sampling

Chapter Overview

Now you'll see how social scientists can select a few people for study and discover things that apply to hundreds of millions of people not studied.

Introduction
A Brief History of Sampling
  President Alf Landon
  President Thomas E. Dewey
  Two Types of Sampling Methods
Nonprobability Sampling
  Reliance on Available Subjects
  Purposive or Judgmental Sampling
  Snowball Sampling
  Quota Sampling
  Selecting Informants
The Theory and Logic of Probability Sampling
  Conscious and Unconscious Sampling Bias
  Representativeness and Probability of Selection
  Random Selection
  Probability Theory, Sampling Distributions, and Estimates of Sampling Error
Populations and Sampling Frames
  Review of Populations and Sampling Frames
Types of Sampling Designs
  Simple Random Sampling
  Systematic Sampling
  Stratified Sampling
  Implicit Stratification in Systematic Sampling
  Illustration: Sampling University Students
Multistage Cluster Sampling
  Multistage Designs and Sampling Error
  Stratification in Multistage Cluster Sampling
  Probability Proportionate to Size (PPS) Sampling
  Disproportionate Sampling and Weighting
Probability Sampling in Review
MAIN POINTS
KEY TERMS
REVIEW QUESTIONS AND EXERCISES
ADDITIONAL READINGS
RESOURCES ON THE INTERNET

Introduction

One of the most visible uses of survey sampling lies in the political polling that is subsequently tested by the election results. While some people doubt the accuracy of sample surveys, others complain that political polls take all the suspense out of campaigns by foretelling the result. Going into the 2000 presidential elections, however, pollsters generally agreed that the election was "too close to call." Robert Worcester has compiled the national polls completed during the two days before the election. Despite some variations, the overall picture they present is amazingly consistent. As we now know, the election was so close that even the election officials were unable to declare the result unambiguously, and the matter had to be settled in the Supreme Court. That cliff-hanger proved once and for all that the accuracy of modern polling doesn't have to take all the suspense out of elections.

TABLE 7-1
Election Eve Polls Reporting Percentage of Population Voting for U.S. Presidential Candidates, 2000

Date   Agency                                Gore   Bush   Nader   Buchanan*
11/5   Hotline [Polling Co/GSG]              43%    51%    4%      1%
11/5   Marist College                        46     51     2       1
11/5   Fox [Opinion Dynamics]                47     47     3       2
11/5   Newsweek [PRSA]                       46     49     6       0
11/5   NBC/Wall St. Journal [Hart/Teeter]    45     48     4       2
11/5   Pew                                   46     49     3       1
11/5   ICR                                   44     46     7       2
11/5   Harris                                47     47     5       1
11/5   Harris (online)                       47     47     4       2
11/5   ABC/Washington Post [TNSI]            46     49     3       1
11/6   IBD/CSM [TIPP]                        47     49     4       0
11/6   CBS                                   48     47     4       1
11/6   Portrait of America [Rasmussen]       43     52     4       1
11/6   CNN/USA Today [Gallup]                46     48     4       1
11/6   Reuters/MSNBC [Zogby]                 48     46     5       1
11/6   Voter.com [Lake/Goeas]                45     51     4       0
11/7   Election Results                      48     48     3       1

*"Don't knows" have been apportioned so the totals equal 100%. (Rounding error may result in totals of 99% or 101%.)
Source: Adapted from Robert Worcester, WAPOR Newsletter, Winter 2001.

Now, how many interviews do you suppose it took each of these pollsters to come within a couple of percentage points in estimating the behavior of about a hundred million voters? Often fewer than 2,000! In this chapter, we're going to find out how social researchers can pull off such wizardry.

For another powerful illustration of the potency of sampling, look at this graphic portrayal of President George W. Bush's approval ratings prior to and following the September 11, 2001, terrorist attack on the U.S. (see Figure 7-1). The data reported by several different polling agencies describe the same pattern.

Political polling, like other forms of social research, rests on observations. But neither pollsters nor other social researchers can observe everything that might be relevant to their interests. A critical
part of social research, then, is deciding what to observe and what not. If you want to study voters, for example, which voters should you study?

The process of selecting observations is called sampling. Although sampling can mean any procedure for selecting units of observation (for example, interviewing every tenth passerby on a busy street), the key to generalizing from a sample to a larger population is probability sampling, which involves the important idea of random selection.

Much of this chapter is devoted to the logic and skills of probability sampling. This topic is more rigorous and precise than some of the other topics in this book. Whereas social research as a whole is both art and science, sampling leans toward science. Although this subject is somewhat technical, the basic logic of sampling is not difficult to understand. In fact, the logical neatness of this topic can make it easier to comprehend than, say, conceptualization.

Although probability sampling is central to social research today, we'll take some time to examine a variety of nonprobability methods as well. These methods have their own logic and can provide useful samples for social inquiry.

Before we discuss the two major types of sampling, I'll introduce you to some basic ideas by way of a brief history of sampling. As you'll see, the pollsters who correctly predicted the election cliff-hanger of 2000 did so in part because researchers had learned to avoid some pitfalls that earlier pollsters had not avoided.

FIGURE 7-1
Bush Approval: Raw Poll Data
[Figure: approval ratings plotted over time on a vertical scale running from 40 to 100, with separate markers for ABC/Post, Bloomberg, CBS, CNN/Time, Fox, Gallup, Harris, IBD/CSM, NBC/WSJ, Newsweek, Pew, Zogby, AmResGp, and Ipsos-Reid.]
Source: Copyright 2001, 2002 by ddimedck.com. All rights reserved. Available: http://www.polikitz.homestead.comtfilesIMyHTMt2.gf.

A Brief History of Sampling

Sampling in social research has developed hand in hand with political polling. This is the case, no doubt, because political polling is one of the few opportunities social researchers have to discover the accuracy of their estimates. On election day, they find out how well or how poorly they did.

President Alf Landon

President Alf Landon? Who's he? Did you sleep through an entire presidency in your U.S. history class? No, but Alf Landon would have been president if a famous poll conducted by the Literary Digest had proved to be accurate. The Literary Digest was a popular newsmagazine published between 1890 and 1938. In 1920, Digest editors mailed postcards to people in six states, asking them whom they were planning to vote for in the presidential campaign between Warren Harding and James Cox. Names were selected for the poll from telephone directories and automobile registration lists. Based on the postcards sent back, the Digest correctly predicted that Harding would be elected. In the elections that followed, the Literary Digest expanded the size of its poll and made correct predictions in 1924, 1928, and 1932.

In 1936, the Digest conducted its most ambitious poll: Ten million ballots were sent to people listed in telephone directories and on lists of automobile owners. Over two million people responded, giving the Republican contender, Alf Landon, a stunning 57 to 43 percent landslide over the incumbent, President Franklin Roosevelt. The editors modestly cautioned,

    We make no claim to infallibility. We did not coin the phrase "uncanny accuracy" which has been so freely applied to our Polls. We know only too well the limitations of every straw vote, however enormous the sample gathered, however scientific the method. It would be a miracle if every State of the forty-eight behaved on Election Day exactly as forecast by the Poll.
    (Literary Digest 1936a:6)

Two weeks later, the Digest editors knew the limitations of straw polls even better: The voters gave Roosevelt a second term in office by the largest landslide in history, with 61 percent of the vote. Landon won only 8 electoral votes to Roosevelt's 523.

The editors were puzzled by their unfortunate turn of luck. A part of the problem surely lay in the 22 percent return rate garnered by the poll. The editors asked,

    Why did only one in five voters in Chicago to whom the Digest sent ballots take the trouble to reply? And why was there a preponderance of Republicans in the one-fifth that did reply? ... We were getting better cooperation in what we have always regarded as a public service from Republicans than we were getting from Democrats. Do Republicans live nearer to mailboxes? Do Democrats generally disapprove of straw polls?
    (Literary Digest 1936b:7)

Actually, there was a better explanation: what is technically called the sampling frame used by the Digest. In this case the sampling frame consisted of telephone subscribers and automobile owners. In the context of 1936, this design selected a disproportionately wealthy sample of the voting population, especially coming on the tail end of the worst economic depression in the nation's history. The sample effectively excluded poor people, and the poor voted predominantly for Roosevelt's New Deal recovery program. The Digest's poll may or may not have correctly represented the voting intentions of telephone subscribers and automobile owners. Unfortunately for the editors, it decidedly did not represent the voting intentions of the population as a whole.

President Thomas E. Dewey

The 1936 election also saw the emergence of a young pollster whose name would become synonymous with public opinion. In contrast to the Literary Digest, George Gallup correctly predicted that Roosevelt would beat Landon. Gallup's success in 1936 hinged on his use of something called quota
sampling, which we'll look at more closely later in the chapter. For now, it's enough to know that quota sampling is based on a knowledge of the characteristics of the population being sampled: what proportion are men, what proportion are women, what proportions are of various incomes, ages, and so on. Quota sampling selects people to match a set of these characteristics: the right number of poor, white, rural men; the right number of rich, African-American, urban women; and so on. The quotas are based on those variables most relevant to the study. In the case of Gallup's poll, the sample selection was based on levels of income; the selection procedure ensured the right proportion of respondents at each income level.

Gallup and his American Institute of Public Opinion used quota sampling to good effect in 1936, 1940, and 1944, correctly picking the presidential winner each of those years. Then, in 1948, Gallup and most political pollsters suffered the embarrassment of picking Governor Thomas Dewey of New York over the incumbent, President Harry Truman. The pollsters' embarrassing miscue continued right up to election night. A famous photograph shows a jubilant Truman (whose followers' battle cry was "Give 'em hell, Harry!") holding aloft a newspaper with the banner headline "Dewey Defeats Truman."

Several factors accounted for the pollsters' failure in 1948. First, most pollsters stopped polling in early October despite a steady trend toward Truman during the campaign. In addition, many voters were undecided throughout the campaign, and these went disproportionately for Truman when they stepped into the voting booth.

More important, Gallup's failure rested on the unrepresentativeness of his samples. Quota sampling, which had been effective in earlier years, was Gallup's undoing in 1948. This technique requires that the researcher know something about the total population (of voters in this instance). For national political polls, such information came primarily from census data. By 1948, however, World War II had produced a massive movement from the country to cities, radically changing the character of the U.S. population from what the 1940 census showed, and Gallup relied on 1940 census data. City dwellers, moreover, tended to vote Democratic; hence, the overrepresentation of rural voters in his poll had the effect of underestimating the number of Democratic votes.

Two Types of Sampling Methods

By 1948, some academic researchers had already been experimenting with a form of sampling based on probability theory. This technique involves the selection of a "random sample" from a list containing the names of everyone in the population being sampled. By and large, the probability sampling methods used in 1948 were far more accurate than quota sampling techniques.

Today, probability sampling remains the primary method of selecting large, representative samples for social research, including national political polls. At the same time, probability sampling can be impossible or inappropriate in many research situations. Accordingly, before turning to the logic and techniques of probability sampling, we'll first take a look at techniques for nonprobability sampling and how they're used in social research.

Nonprobability Sampling

Social research is often conducted in situations that do not permit the kinds of probability samples used in large-scale social surveys. Suppose you wanted to study homelessness: There is no list of all homeless individuals, nor are you likely to create such a list. Moreover, as you'll see, there are times when probability sampling wouldn't be appropriate even if it were possible. Many such situations call for nonprobability sampling.

nonprobability sampling: Any technique in which samples are selected in some way not suggested by probability theory. Examples include reliance on available subjects as well as purposive (judgmental), quota, and snowball sampling.

In this section, we'll examine four types of nonprobability sampling: reliance on available subjects, purposive or judgmental sampling, snowball sampling, and quota sampling. We'll conclude with a brief discussion of techniques for obtaining information about social groups through the use of informants.

Reliance on Available Subjects

Relying on available subjects, such as stopping people at a street corner or some other location, is an extremely risky sampling method; even so, it's used all too frequently. Clearly, this method does not permit any control over the representativeness of a sample. It's justified only if the researcher wants to study the characteristics of people passing the sampling point at specified times or if less risky sampling methods are not feasible. Even when this method is justified on grounds of feasibility, researchers must exercise great caution in generalizing from their data. Also, they should alert readers to the risks associated with this method.

University researchers frequently conduct surveys among the students enrolled in large lecture classes. The ease and frugality of such a method explains its popularity, but it seldom produces data of any general value. It may be useful for pretesting a questionnaire, but such a sampling method should not be used for a study purportedly describing students as a whole.

Consider this report on the sampling design in an examination of knowledge and opinions about nutrition and cancer among medical students and family physicians:

    The fourth-year medical students of the University of Minnesota Medical School in Minneapolis comprised the student population in this study. The physician population consisted of all physicians attending a "Family Practice Review and Update" course sponsored by the University of Minnesota Department of Continuing Medical Education.
    (Cooper-Stephenson and Theologides 1981:472)

After all is said and done, what will the results of this study represent? They do not provide a meaningful comparison of medical students and family physicians in the United States or even in Minnesota. Who were the physicians who attended the course? We can guess that they were probably more concerned about their continuing education than were other physicians, but we can't say for sure. While such studies can be the source of useful insights, we must take care not to overgeneralize from them.

Purposive or Judgmental Sampling

Sometimes it's appropriate to select a sample on the basis of knowledge of a population, its elements, and the purpose of the study. This type of sampling is called purposive or judgmental sampling. In the initial design of a questionnaire, for example, you might wish to select the widest variety of respondents to test the broad applicability of questions. Although the study findings would not represent any meaningful population, the test run might effectively uncover any peculiar defects in your questionnaire. This situation would be considered a pretest, however, rather than a final study.

purposive (judgmental) sampling: A type of nonprobability sampling in which you select the units to be observed on the basis of your own judgment about which ones will be the most useful or representative.

In some instances, you may wish to study a small subset of a larger population in which many members of the subset are easily identified, but the enumeration of them all would be nearly impossible. For example, you might want to study the leadership of a student protest movement; many of the leaders are easily visible, but it would not be feasible to define and sample all the leaders. In studying all or a sample of the most visible leaders, you may collect data sufficient for your purposes.

Or let's say you want to compare left-wing and right-wing students. Because you may not be able to enumerate and sample from all such students, you might decide to sample the memberships of left- and right-leaning groups, such as the Green Party and the Young Americans for Freedom. Although such a sample design would not provide a good description of either left-wing or right-wing
students as a whole, it might suffice for general comparative purposes.

Field researchers are often particularly interested in studying deviant cases (cases that don't fit into fairly regular patterns of attitudes and behaviors) in order to improve their understanding of the more regular pattern. For example, you might gain important insights into the nature of school spirit, as exhibited at a pep rally, by interviewing people who did not appear to be caught up in the emotions of the crowd or by interviewing students who did not attend the rally at all. Selecting deviant cases for study is another example of purposive study.

Snowball Sampling

Another nonprobability sampling technique, which some consider to be a form of accidental sampling, is called snowball sampling. This procedure is appropriate when the members of a special population are difficult to locate, such as homeless individuals, migrant workers, or undocumented immigrants. In snowball sampling, the researcher collects data on the few members of the target population he or she can locate, then asks those individuals to provide the information needed to locate other members of that population whom they happen to know. "Snowball" refers to the process of accumulation as each located subject suggests other subjects. Because this procedure also results in samples with questionable representativeness, it's used primarily for exploratory purposes.

snowball sampling: A nonprobability sampling method often employed in field research whereby each person interviewed may be asked to suggest additional people for interviewing.

Suppose you wish to learn a community organization's pattern of recruitment over time. You might begin by interviewing fairly recent recruits, asking them who introduced them to the group. You might then interview the people named, asking them who introduced them to the group. You might then interview those people named, asking, in part, who introduced them. Or, in studying a loosely structured political group, you might ask one of the participants who he or she believes to be the most influential members of the group. You might interview those people and, in the course of the interviews, ask who they believe to be the most influential. In each of these examples, your sample would "snowball" as each of your interviewees suggested other people to interview.

Quota Sampling

Quota sampling is the method that helped George Gallup avoid disaster in 1936 and set up the disaster of 1948. Like probability sampling, quota sampling addresses the issue of representativeness, although the two methods approach the issue quite differently.

quota sampling: A type of nonprobability sampling in which units are selected into a sample on the basis of prespecified characteristics, so that the total sample will have the same distribution of characteristics assumed to exist in the population being studied.

Quota sampling begins with a matrix, or table, describing the characteristics of the target population. Depending on your research purposes, you may need to know what proportion of the population is male and what proportion female as well as what proportions of each gender fall into various age categories, educational levels, ethnic groups, and so forth. In establishing a national quota sample, you might need to know what proportion of the national population is urban, eastern, male, under 25, white, working class, and the like, and all the possible combinations of these attributes.

Once you've created such a matrix and assigned a relative proportion to each cell in the matrix, you proceed to collect data from people having all the characteristics of a given cell. You then assign to all the people in a given cell a weight appropriate to their portion of the total population. When all the sample elements are so weighted, the overall data should provide a reasonable representation of the total population.

Although quota sampling resembles probability sampling, it has several inherent problems. First, the quota frame (the proportions that different cells represent) must be accurate, and it is often difficult to get up-to-date information for this purpose. The Gallup failure to predict Truman as the presidential victor in 1948 was due partly to this problem. Second, the selection of sample elements within a given cell may be biased even though its proportion of the population is accurately estimated. Instructed to interview five people who meet a given, complex set of characteristics, an interviewer may still avoid people living at the top of seven-story walkups, having particularly run-down homes, or owning vicious dogs.

In recent years, attempts have been made to combine probability and quota sampling methods, but the effectiveness of this effort remains to be seen. At present, you would be advised to treat quota sampling warily if your purpose is statistical description.

At the same time, the logic of quota sampling can sometimes be applied usefully to a field research project. In the study of a formal group, for example, you might wish to interview both leaders and nonleaders. In studying a student political organization, you might want to interview radical, moderate, and conservative members of that group. You may be able to achieve sufficient representativeness in such cases by using quota sampling to ensure that you interview both men and women, both younger and older people, and so forth.

Selecting Informants

When field research involves the researcher's attempt to understand some social setting (a juvenile gang or local neighborhood, for example), much of that understanding will come from a collaboration with some members of the group being studied. Whereas social researchers speak of respondents as people who provide information about themselves, allowing the researcher to construct a composite picture of the group those respondents represent, an informant is a member of the group who can talk directly about the group per se.

informant: Someone well versed in the social phenomenon that you wish to study and who is willing to tell you what he or she knows about it. Not to be confused with a respondent.

Especially important to anthropologists, informants are important to other social researchers as well. If you wanted to learn about informal social networks in a local public housing project, for example, you would do well to locate individuals who could understand what you were looking for and help you find it.

When Jeffrey Johnson (1990) set out to study a salmon-fishing community in North Carolina, he used several criteria to evaluate potential informants. Did their positions allow them to interact regularly with other members of the camp, for example, or were they isolated? (He found that the carpenter had a wider range of interactions than did the boat captain.) Was their information about the camp pretty much limited to their specific jobs, or did it cover many aspects of the operation? These and other criteria helped determine how useful the potential informants might be.

Usually, you'll want to select informants somewhat typical of the groups you're studying. Otherwise, their observations and opinions may be misleading. Interviewing only physicians will not give you a well-rounded view of how a community medical clinic is working, for example. Along the same lines, an anthropologist who interviews only men in a society where women are sheltered from outsiders will get a biased view. Similarly, while informants fluent in English are convenient for English-speaking researchers from the United States, they do not typify the members of many societies and even many subgroups within English-speaking countries.

Simply because they're the ones willing to work with outside investigators, informants will almost always be somewhat "marginal" or atypical within their group. Sometimes this is obvious. Other times, however, you'll learn about their marginality only in the course of your research.

In Jeffrey Johnson's study, the county agent identified one fisherman who seemed squarely in the mainstream of the community. Moreover, he was cooperative and helpful to Johnson's research. The more Johnson worked with the fisherman, however, the more he found the man to be a marginal member of the fishing community.
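
Returning briefly to quota sampling: the matrix-and-weights procedure described earlier in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from any polling organization. The quota frame, its four cells, and their proportions are invented, as are the function names `cell_weights` and `weighted_proportion`. The sketch weights each respondent by the ratio of his or her cell's share of the population to that cell's share of the sample, which is one common way of making a weighted sample mirror the frame.

```python
from collections import Counter

# Hypothetical quota frame: each cell's assumed share of the target population.
# The cells and proportions are invented for illustration only.
QUOTA_FRAME = {
    ("urban", "female"): 0.30,
    ("urban", "male"): 0.28,
    ("rural", "female"): 0.22,
    ("rural", "male"): 0.20,
}


def cell_weights(respondent_cells, quota_frame):
    """Weight for each cell: its population share divided by its sample share."""
    counts = Counter(respondent_cells)
    n = len(respondent_cells)
    weights = {}
    for cell, population_share in quota_frame.items():
        if counts[cell] == 0:
            raise ValueError(f"no respondents collected for cell {cell}")
        weights[cell] = population_share / (counts[cell] / n)
    return weights


def weighted_proportion(respondents, quota_frame):
    """Weighted share of 'yes' answers.

    respondents: list of (cell, answered_yes) pairs, where every cell
    appears in quota_frame.
    """
    weights = cell_weights([cell for cell, _ in respondents], quota_frame)
    total = sum(weights[cell] for cell, _ in respondents)
    yes = sum(weights[cell] for cell, answered_yes in respondents if answered_yes)
    return yes / total


# Made-up sample: ten interviews per cell, with a different number of
# "yes" answers in each cell.
yes_counts = {
    ("urban", "female"): 8,
    ("urban", "male"): 6,
    ("rural", "female"): 4,
    ("rural", "male"): 2,
}
sample = [
    (cell, i < yes)
    for cell, yes in yes_counts.items()
    for i in range(10)
]

unweighted = sum(answered_yes for _, answered_yes in sample) / len(sample)
print(round(unweighted, 3))                                 # 0.5
print(round(weighted_proportion(sample, QUOTA_FRAME), 3))   # 0.536
```

With these made-up numbers, the unweighted share of "yes" answers is 0.50, whereas the weighted share is 0.536, because the cells with more "yes" answers are assumed to make up more of the population than of the sample. Notice that the weights are only as good as the quota frame: if the assumed proportions are out of date, which is the problem Gallup faced in 1948, the weighted estimate inherits that error.
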
    First, he was a Yankee in a southern town. Second, he had a pension from the Navy [so he was not seen as a "serious fisherman" by others in the community].... Third, he was a major Republican activist in a mostly Democratic village. Finally, he kept his boat in an isolated anchorage, far from the community harbor.
    (1990:56)

Informants' marginality may not only bias the view you get, but their marginal status may also limit their access (and hence yours) to the different sectors of the community you wish to study.

These comments should give you some sense of the concerns involved in nonprobability sampling, typically used in qualitative research projects. I conclude with the following injunction:

    Your overall goal is to collect the richest possible data. Rich data mean, ideally, a wide and diverse range of information collected over a relatively prolonged period of time. Again, ideally, you achieve this through direct, face-to-face contact with, and prolonged immersion in, some social location or circumstance.
    (Lofland and Lofland 1995:16)

In other words, nonprobability sampling does have its uses, particularly in qualitative research projects. But researchers must take care to acknowledge the limitations of nonprobability sampling, especially regarding accurate and precise representations of populations. This point will become clearer as we discuss the logic and techniques of probability sampling.

As you can see, choosing and using informants can be a tricky business. To see some practical implications of doing so, you can visit the Web site of Canada's Community Adaptation and Sustainable Livelihoods (CASL) Program: http://iisd.ca/casl/CASLGuide/KeyInformEx.htm.

The Theory and Logic of Probability Sampling

While appropriate to some research purposes, nonprobability sampling methods cannot guarantee that the sample we observed is representative of the whole population. When researchers want precise, statistical descriptions of large populations (for example, the percentage of the population who are unemployed, plan to vote for Candidate X, or feel a rape victim should have the right to an abortion), they turn to probability sampling. All large-scale surveys use probability-sampling methods.

probability sampling: The general term for samples selected in accord with probability theory, typically involving some random-selection mechanism. Specific types of probability sampling include EPSEM, PPS, simple random sampling, and systematic sampling.

Although the application of probability sampling involves some sophisticated use of statistics, the basic logic of probability sampling is not difficult to understand. If all members of a population were identical in all respects (all demographic characteristics, attitudes, experiences, behaviors, and so on), there would be no need for careful sampling procedures. In this extreme case of perfect homogeneity, in fact, any single case would suffice as a sample to study characteristics of the whole population.

In fact, of course, the human beings who compose any real population are quite heterogeneous, varying in many ways. Figure 7-2 offers a simplified illustration of a heterogeneous population: The 100 members of this small population differ by gender and race. We'll use this hypothetical micropopulation to illustrate various aspects of probability sampling.

FIGURE 7-2
A Population of 100 Folks
[Figure: 44 white women, 44 white men, 6 African-American women, 6 African-American men]

The fundamental idea behind probability sampling is this: To provide useful descriptions of the total population, a sample of individuals from a population must contain essentially the same variations that exist in the population. This isn't as simple as it might seem, however. Let's take a minute to look at some of the ways researchers might go astray. Then, we'll see how probability sampling provides an efficient method for selecting a sample that should adequately reflect variations that exist in the population.

Conscious and Unconscious Sampling Bias

At first glance, it may look as though sampling is pretty straightforward. To select a sample of 100 university students, you might simply interview the first 100 students you find walking around campus. This kind of sampling method is often used by untrained researchers, but it runs a very high risk of introducing biases into the samples.

In connection with sampling, bias simply means that those selected are not typical or representative of the larger populations they have been chosen from. This kind of bias does not have to be intentional. In fact, it is virtually inevitable when you pick people by the seat of your pants.

Figure 7-3 illustrates what can happen when researchers simply select people who are convenient for study. Although women are only 50 percent of our micropopulation, those closest to the researcher (in the upper-right corner) happen to be 70 percent women, and although the population is 12 percent black, none was selected into the sample.

Beyond the risks inherent in simply studying people who are convenient, other problems can arise. To begin with, the researcher's personal leanings may affect the sample to the point where it does not truly represent the student population. Suppose you're a little intimidated by students who look particularly "cool," feeling they might ridicule your research effort. You might consciously or unconsciously avoid interviewing such people. Or, you might feel that the attitudes of "super-straight-looking" students would be irrelevant to your research purposes and so avoid interviewing them.

Even if you sought to interview a "balanced" group of students, you wouldn't know the exact proportions of different types of students making

up such a balance, and you wouldn't always be able to identify the different types just by watching them walk by.

FIGURE 7-3 A Sample of Convenience: Easy, but Not Representative

Even if you made a conscientious effort to interview, say, every tenth student entering the university library, you could not be sure of a representative sample, because different types of students visit the library with different frequencies. Your sample would overrepresent students who visit the library more often than do others.

Similarly, the "public opinion" call-in polls, in which radio stations or newspapers ask people to call specified telephone numbers to register their opinions, cannot be trusted to represent general populations. At the very least, not everyone in the population will even be aware of the poll. This problem also invalidates polls by magazines and newspapers that publish coupons for readers to complete and mail in. Even among those who are aware of such polls, not all will express an opinion, especially if doing so will cost them a stamp, an envelope, or a telephone charge. Similar considerations apply to polls taken over the Internet.

Ironically, the failure of such polls to represent all opinions equally was inadvertently acknowledged by Phillip Perinelli (1986), a staff manager of AT&T Communications' DIAL-IT 900 Service, which offers a call-in poll facility to organizations. Perinelli attempted to counter criticisms by saying, "The 50-cent charge assures that only interested parties respond and helps assure also that no individual 'stuffs' the ballot box." We cannot determine general public opinion while considering "only interested parties." This excludes those who don't care 50-cents' worth, as well as those who recognize that such polls are not valid. Both types of people may have opinions and may even vote on election day. Perinelli's assertion that the 50-cent charge will prevent ballot stuffing actually means that only those who can afford it will engage in ballot stuffing.

The possibilities for inadvertent sampling bias are endless and not always obvious. Fortunately, there are techniques that help us avoid bias.

Representativeness and Probability of Selection

Although the term representativeness has no precise, scientific meaning, it carries a commonsense meaning that makes it useful here. For our purpose, a sample is representative of the population from which it is selected if the aggregate characteristics of the sample closely approximate those same aggregate characteristics in the population. If, for example, the population contains 50 percent women, then a sample must contain "close to" 50 percent women to be representative. Later, we'll discuss "how close" in detail.

Note that samples need not be representative in all respects; representativeness is limited to those characteristics that are relevant to the substantive interests of the study. However, you may not know in advance which characteristics are relevant.

A basic principle of probability sampling is that a sample will be representative of the population from which it is selected if all members of the population have an equal chance of being selected in the sample. (We'll see shortly that the size of the sample selected also affects the degree of representativeness.) Samples that have this quality are often labeled EPSEM samples (EPSEM stands for "equal probability of selection method"). Later we'll discuss variations of this principle, which forms the basis of probability sampling.

Moving beyond this basic principle, we must realize that samples, even carefully selected EPSEM samples, seldom if ever perfectly represent the populations from which they are drawn. Nevertheless, probability sampling offers two special advantages.

First, probability samples, although never perfectly representative, are typically more representative than other types of samples, because the biases previously discussed are avoided. In practice, a probability sample is more likely than a nonprobability sample to be representative of the population from which it is drawn.

Second, and more important, probability theory permits us to estimate the accuracy or representativeness of the sample. Conceivably, an uninformed researcher might, through wholly haphazard means, select a sample that nearly perfectly represents the larger population. The odds are against doing so, however, and we would be unable to estimate the likelihood that he or she has achieved representativeness. The probability sampler, on the other hand, can provide an accurate estimate of success or failure. We'll shortly see exactly how this estimate can be achieved.

representativeness  That quality of a sample of having the same distribution of characteristics as the population from which it was selected. By implication, descriptions and explanations derived from an analysis of the sample may be assumed to represent similar ones in the population. Representativeness is enhanced by probability sampling and provides for generalizability and the use of inferential statistics.

EPSEM (equal probability of selection method)  A sample design in which each member of a population has the same chance of being selected into the sample.

element  That unit of which a population is comprised and which is selected in a sample. Distinguished from units of analysis, which are used in data analysis.

I've said that probability sampling ensures that samples are representative of the population we wish to study. As we'll see in a moment, probability sampling rests on the use of a random selection procedure. To develop this idea, though, we need to give more precise meaning to two important terms: element and population.*

*I would like to acknowledge a debt to Leslie Kish and his excellent textbook Survey Sampling. Although I've modified some of the conventions used by Kish, his presentation is easily the most important source of this discussion.

An element is that unit about which information is collected and that provides the basis of analysis. Typically, in survey research, elements are people or certain types of people. However, other kinds of units can constitute the elements for social research: Families, social clubs, or corporations might be the elements of a study. In a given study, elements and units of analysis are often the same

as units of analysis, though the former are used in sample selection and the latter in data analysis.

Up to now we've used the term population to mean the group or collection that we're interested in generalizing about. More formally, a population is the theoretically specified aggregation of study elements. Whereas the vague term Americans might be the target for a study, the delineation of the population would include the definition of the element Americans (for example, citizenship, residence) and the time referent for the study (Americans as of when?). Translating the abstract "adult New Yorkers" into a workable population would require a specification of the age defining adult and the boundaries of New York. Specifying the term college student would induce a consideration of full- and part-time students, degree candidates and nondegree candidates, undergraduate and graduate students, and so forth.

A study population is that aggregation of elements from which the sample is actually selected. As a practical matter, researchers are seldom in a position to guarantee that every element meeting the theoretical definitions laid down actually has a chance of being selected in the sample. Even where lists of elements exist for sampling purposes, the lists are usually somewhat incomplete. Some students are always inadvertently omitted from student rosters. Some telephone subscribers request that their names and numbers be unlisted.

Often, researchers decide to limit their study populations more severely than indicated in the preceding examples. National polling firms may limit their national samples to the 48 adjacent states, omitting Alaska and Hawaii for practical reasons. A researcher wishing to sample psychology professors may limit the study population to those in psychology departments, omitting those in other departments. Whenever the population under examination is altered in such fashions, you must make the revisions clear to your readers.

population  The theoretically specified aggregation of the elements in a study.

study population  That aggregation of elements from which a sample is actually selected.

Random Selection

With these definitions in hand, we can define the ultimate purpose of sampling: to select a set of elements from a population in such a way that descriptions of those elements accurately portray the total population from which the elements are selected. Probability sampling enhances the likelihood of accomplishing this aim and also provides methods for estimating the degree of probable success.

Random selection is the key to this process. In random selection, each element has an equal chance of selection independent of any other event in the selection process. Flipping a coin is the most frequently cited example: Provided that the coin is perfect (that is, not biased in terms of coming up heads or tails), the "selection" of a head or a tail is independent of previous selections of heads or tails. No matter how many heads turn up in a row, the chance that the next flip will produce "heads" is exactly 50-50. Rolling a perfect set of dice is another example.

Such images of random selection, while useful, seldom apply directly to sampling methods in social research. More typically, social researchers use tables of random numbers or computer programs that provide a random selection of sampling units. A sampling unit is that element or set of elements considered for selection in some stage of sampling. In Chapter 9, on survey research, we'll see how computers are used to select random telephone numbers for interviewing, a technique called random-digit dialing.

random selection  A sampling method in which each element has an equal chance of selection independent of any other event in the selection process.

sampling unit  That element or set of elements considered for selection in some stage of sampling.

The reasons for using random selection methods are twofold. First, this procedure serves as a check on conscious or unconscious bias on the part of the researcher. The researcher who selects cases on an intuitive basis might very well select cases that would support his or her research expectations or hypotheses. Random selection erases this danger. More important, random selection offers access to the body of probability theory, which provides the basis for estimating the characteristics of the population as well as estimates of the accuracy of samples. Let's now examine probability theory in greater detail.

Probability Theory, Sampling Distributions, and Estimates of Sampling Error

Probability theory is a branch of mathematics that provides the tools researchers need to devise sampling techniques that produce representative samples and to analyze the results of their sampling statistically. More formally, probability theory provides the basis for estimating the parameters of a population. A parameter is the summary description of a given variable in a population. The mean income of all families in a city is a parameter; so is the age distribution of the city's population. When researchers generalize from a sample, they're using sample observations to estimate population parameters. Probability theory enables them both to make these estimates and to arrive at a judgment of how likely the estimates will accurately represent the actual parameters in the population. So, for example, probability theory allows pollsters to infer from a sample of 2,000 voters how a population of 100 million voters is likely to vote, and to specify exactly what the probable margin of error in the estimates is.

parameter  The summary description of a given variable in a population.

Probability theory accomplishes these seemingly magical feats by way of the concept of sampling distributions. A single sample selected from a population will give an estimate of the population parameter. Other samples would give the same or slightly different estimates. Probability theory tells us about the distribution of estimates that would be produced by a large number of such samples. To see how this works, we'll look at two examples of sampling distributions, beginning with a simple example in which our population consists of just ten cases.

The Sampling Distribution of Ten Cases

Suppose there are ten people in a group, and each has a certain amount of money in his or her pocket. To simplify, let's assume that one person has no money, another has one dollar, another has two dollars, and so forth up to the person with nine dollars. Figure 7-4 presents the population of ten people.*

*I want to thank Hanan Selvin for suggesting this method of introducing probability sampling.

Our task is to determine the average amount of money one person has: specifically, the mean number of dollars. If you simply add up the money shown in Figure 7-4, you'll find that the total is $45, so the mean is $4.50. Our purpose in the rest of this exercise is to estimate that mean without actually observing all ten individuals. We'll do that by selecting random samples from the population and using the means of those samples to estimate the mean of the whole population.

To start, suppose we were to select, at random, a sample of only one person from the ten. Our ten possible samples thus consist of the ten cases shown in Figure 7-4.

The ten dots shown on the graph in Figure 7-5 represent these ten samples. Since we're taking samples of only one, they also represent the "means" we would get as estimates of the population. The distribution of the dots on the graph is called the sampling distribution. Obviously, it wouldn't be a very good idea to select a sample of only one, since we stand a very good chance of missing the true mean of $4.50 by quite a bit.

Now suppose we take a sample of two. As shown in Figure 7-6, increasing the sample size improves our estimations. There are now 45 possible samples: [$0 $1], [$0 $2], ... [$7 $8], [$8 $9]. Moreover, some of those samples produce the same means. For example, [$0 $6], [$1 $5], and [$2 $4] all produce means of $3. In Figure 7-6, the three dots shown above the $3 mean represent those three samples.

Moreover, the 45 samples are not evenly distributed, as they were when the sample size was only one. Rather, they are somewhat clustered around the true value of $4.50. Only two possible samples deviate by as much as $4 from the true


value ([$0 $1] and [$8 $9]), whereas five of the samples would give the true estimate of $4.50; another eight samples miss the mark by only 50 cents (plus or minus).

FIGURE 7-4 A Population of Ten People with $0-$9

FIGURE 7-5 The Sampling Distribution of Samples of 1 (x axis: estimate of mean, $0 to $9; true mean = $4.50)

FIGURE 7-6 The Sampling Distribution of Samples of 2 (x axis: estimate of mean, $0 to $9; true mean = $4.50)

Now suppose we select even larger samples. What do you suppose that will do to our estimates of the mean? Figure 7-7 presents the sampling distributions of samples of 3, 4, 5, and 6.

FIGURE 7-7 The Sampling Distributions of Samples of 3, 4, 5, and 6 (panels: A. Samples of 3; B. Samples of 4; C. Samples of 5; D. Samples of 6; x axis: estimate of mean; true mean = $4.50)

The progression of sampling distributions is clear. Every increase in sample size improves the distribution of estimates of the mean. The limiting case in this procedure, of course, is to select a sample of ten. There would be only one possible sample (everyone) and it would give us the true mean of $4.50. As we'll see shortly, this principle applies to actual sampling of meaningful populations. The larger the sample selected, the more accurate it is as an estimation of the population from which it was drawn.

Sampling Distribution and Estimates of Sampling Error

Let's turn now to a more realistic sampling situation involving a much larger population and see how the notion of sampling distribution applies. Assume that we wish to study the student population of State University (SU) to determine the percentage of students who approve or disapprove of a student conduct code proposed by the administration. The study population will be the aggregation of, say, 20,000 students contained in a student roster: the sampling frame. The elements will be the individual students at SU. We'll select a random sample of, say, 100 students for the purposes of estimating the entire student body. The variable under consideration will be attitudes toward the code, a binomial variable: approve and disapprove. (The logic of probability sampling applies to the examination of other types of variables, such as mean income, but the computations are somewhat more complicated. Consequently, this introduction focuses on binomials.)

The horizontal axis of Figure 7-8 presents all possible values of this parameter in the population, from 0 percent to 100 percent approval. The midpoint of the axis, 50 percent, represents half the students approving of the code and the other half disapproving.

FIGURE 7-8 Range of Possible Sample Study Results (x axis: percent of students approving of the student code, 0 to 100)

To choose our sample, we give each student on the student roster a number and select 100 random numbers from a table of random numbers. Then we interview the 100 students whose numbers have been selected and ask for their attitudes toward the student code: whether they approve or disapprove. Suppose this operation gives us 48 students who approve of the code and 52 who disapprove. This summary description of a variable in a sample is called a statistic. We present this statistic by placing a dot on the x axis at the point representing 48 percent.

Now let's suppose we select another sample of 100 students in exactly the same fashion and measure their approval or disapproval of the student code. Perhaps 51 students in the second sample approve of the code. We place another dot in the appropriate place on the x axis. Repeating this process once more, we may discover that 52 students in the third sample approve of the code.

Figure 7-9 presents the three different sample statistics representing the percentages of students in each of the three random samples who approved of the student code. The basic rule of random sampling is that such samples drawn from a population give estimates of the parameter that exists in the total population. Each of the random samples, then, gives us an estimate of the percentage of students in the total student body who approve of the student code. Unhappily, however, we have selected three samples and now have three separate estimates.

FIGURE 7-9 Results Produced by Three Hypothetical Studies

statistic  The summary description of a variable in a sample, used to estimate a population parameter.
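The two sampling-distribution examples above can be sketched in a few lines of Python. This is only an illustration, not a procedure from the text: the function names are hypothetical, the roster is modeled as a simple list of 1s (approve) and 0s (disapprove), and the 50 percent approval rate is an assumption made purely so the simulation has a parameter to estimate.

```python
import random
from collections import Counter
from itertools import combinations

# Ten-person population from the text: one person each with $0, $1, ..., $9.
POPULATION = list(range(10))
TRUE_MEAN = sum(POPULATION) / len(POPULATION)  # $4.50


def sampling_distribution(size):
    """Count how many of the possible samples of `size` yield each sample mean."""
    return Counter(sum(s) / size for s in combinations(POPULATION, size))


def max_miss(size):
    """Largest gap between any possible sample mean and the true mean."""
    return max(abs(m - TRUE_MEAN) for m in sampling_distribution(size))


def simulate_polls(n_students=20_000, share_approving=0.5,
                   sample_size=100, n_samples=3, seed=0):
    """Draw repeated random samples from a hypothetical student roster.

    `share_approving` is an assumed population parameter, not a value
    taken from the text at this point. Returns percent approving per sample.
    """
    rng = random.Random(seed)
    n_approve = round(n_students * share_approving)
    roster = [1] * n_approve + [0] * (n_students - n_approve)
    return [100 * sum(rng.sample(roster, sample_size)) / sample_size
            for _ in range(n_samples)]


if __name__ == "__main__":
    for size in (1, 2, 3, 4, 5, 6, 10):
        dist = sampling_distribution(size)
        print(f"size={size}: {sum(dist.values())} possible samples, "
              f"largest miss ${max_miss(size):.2f}")
    print("percent approving in three samples:", simulate_polls())
```

For samples of two, the enumeration reproduces the counts given in the text: 45 possible samples, five of which hit $4.50 exactly. The simulated polls will generally differ from one another, which is the point of the three separate estimates discussed above.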


To retrieve ourselves from this problem, let's draw more and more samples of 100 students each, question each of the samples concerning their approval or disapproval of the code, and plot the new sample statistics on our summary graph. In drawing many such samples, we discover that some of the new samples provide duplicate estimates, as in the illustration of ten cases. Figure 7-10 shows the sampling distribution of, say, hundreds of samples. This is often referred to as a normal curve.

FIGURE 7-10 The Sampling Distribution (x axis: percent of students approving of the student code, 0 to 100)

Note that by increasing the number of samples selected and interviewed, we have also increased the range of estimates provided by the sampling operation. In one sense we have increased our dilemma in attempting to guess the parameter in the population. Probability theory, however, provides certain important rules regarding the sampling distribution presented in Figure 7-10.

First, if many independent random samples are selected from a population, the sample statistics provided by those samples will be distributed around the population parameter in a known way. Thus, although Figure 7-10 shows a wide range of estimates, more of them are in the vicinity of 50 percent than elsewhere in the graph. Probability theory tells us, then, that the true value is in the vicinity of 50 percent.

Second, probability theory gives us a formula for estimating how closely the sample statistics are clustered around the true value. To put it another way, probability theory enables us to estimate the sampling error, the degree of error to be expected for a given sample design. This formula contains three factors: the parameter, the sample size, and the standard error (a measure of sampling error):

$$s = \sqrt{\frac{P \times Q}{n}}$$

The symbols P and Q in the formula equal the population parameters for the binomial: If 60 percent of the student body approve of the code and 40 percent disapprove, P and Q are 60 percent and 40 percent, respectively, or .6 and .4. Note that Q = 1 - P and P = 1 - Q. The symbol n equals the number of cases in each sample, and s is the standard error.

Let's assume that the population parameter in the student example is 50 percent approving of the code and 50 percent disapproving. Recall that we've been selecting samples of 100 cases each. When these numbers are put into the formula, we find that the standard error equals .05, or 5 percent.

sampling error  The degree of error to be expected in probability sampling. The formula for determining sampling error contains three factors: the parameter, the sample size, and the standard error.

In probability theory, the standard error is a valuable piece of information because it indicates the extent to which the sample estimates will be distributed around the population parameter. (If you're familiar with the standard deviation in statistics, you may recognize that the standard error, in this case, is the standard deviation of the sampling distribution.) Specifically, probability theory indicates that certain proportions of the sample estimates will fall within specified increments, each equal to one standard error, from the population parameter. Approximately 34 percent (.3413) of the sample estimates will fall within one standard error increment above the population parameter, and another 34 percent will fall within one standard error below the parameter. In our example, the standard error increment is 5 percent, so we know that 34 percent of our samples will give estimates of student approval between 50 percent (the parameter) and 55 percent (one standard error above); another 34 percent of the samples will give estimates between 50 percent and 45 percent (one standard error below the parameter). Taken together, then, we know that roughly two-thirds (68 percent) of the samples will give estimates within ±5 percent of the parameter.

Moreover, probability theory dictates that roughly 95 percent of the samples will fall within plus or minus two standard errors of the true value, and 99.7 percent of the samples will fall within plus or minus three standard errors. In our present example, then, we know that only about three samples out of a thousand would give an estimate lower than 35 percent approval or higher than 65 percent.

The proportion of samples falling within one, two, or three standard errors of the parameter is constant for any random sampling procedure such as the one just described, providing that a large number of samples are selected. The size of the standard error in any given case, however, is a function of the population parameter and the sample size. If we return to the formula for a moment, we note that the standard error will increase as a function of an increase in the quantity P times Q. Note further that this quantity reaches its maximum in the situation of an even split in the population. If P = .5, PQ = .25; if P = .6, PQ = .24; if P = .8, PQ = .16; if P = .99, PQ = .0099. By extension, if P is either 0.0 or 1.0 (either 0 percent or 100 percent approve of the student code), the standard error will be 0. If everyone in the population has the same attitude (no variation), then every sample will give exactly that estimate.

The standard error is also a function of the sample size: an inverse function. As the sample size increases, the standard error decreases. As the sample size increases, the several samples will be clustered nearer to the true value. Another general guideline is evident in the formula: Because of the square root in the formula, the standard error is reduced by half if the sample size is quadrupled. In our present example, samples of 100 produce a standard error of 5 percent; to reduce the standard error to 2.5 percent, we must increase the sample size to 400.

All of this information is provided by established probability theory in reference to the selection of large numbers of random samples. (If you've taken a statistics course, you may know this as the Central Limit Theorem.) If the population parameter is known and many random samples are selected, we can predict how many of the sample estimates will fall within specified intervals from the parameter.

Recognize that this discussion illustrates only the logic of probability sampling; it does not describe the way research is actually conducted. Usually, we don't know the parameter: The very reason we conduct a sample survey is to estimate that value. Moreover, we don't actually select large numbers of samples: We select only one sample. Nevertheless, the preceding discussion of probability theory provides the basis for inferences about the typical social research situation. Knowing what it would be like to select thousands of samples allows us to make assumptions about the one sample we do select and study.

Confidence Levels and Confidence Intervals

Whereas probability theory specifies that 68 percent of that fictitious large number of samples would produce estimates falling within one standard error of the parameter, we can turn the logic around and

infer that any single random sample estimate has a 68 percent chance of falling within that range. This observation leads us to the two key components of sampling error estimates: confidence level and confidence interval. We express the accuracy of our sample statistics in terms of a level of confidence that the statistics fall within a specified interval from the parameter. For example, we may say we are 95 percent confident that our sample statistics (for example, 50 percent favor the new student code) are within plus or minus 5 percentage points of the population parameter. As the confidence interval is expanded for a given statistic, our confidence increases. For example, we may say that we are 99.7 percent confident that our statistic falls within three standard errors of the true value.

Although we may be confident (at some level) of being within a certain range of the parameter, we've already noted that we seldom know what the parameter is. To resolve this problem, we substitute our sample estimate for the parameter in the formula; that is, lacking the true value, we substitute the best available guess.

The result of these inferences and estimations is that we can estimate a population parameter and also the expected degree of error on the basis of one sample drawn from a population. Beginning with the question "What percentage of the student body approves of the student code?" you could select a random sample of 100 students and interview them. You might then report that your best estimate is that 50 percent of the student body ap-

The logic of confidence levels and confidence intervals also provides the basis for determining the appropriate sample size for a study. Once you've decided on the degree of sampling error you can tolerate, you'll be able to calculate the number of cases needed in your sample. Thus, for example, if you want to be 95 percent confident that your study findings are accurate within plus or minus 5 percentage points of the population parameters, you should select a sample of at least 400. (Appendix F is a convenient guide in this regard.)

This, then, is the basic logic of probability sampling. Random selection permits the researcher to link findings from a sample to the body of probability theory so as to estimate the accuracy of those findings. All statements of accuracy in sampling must specify both a confidence level and a confidence interval. The researcher must report that he or she is x percent confident that the population parameter is between two specific values.

Two cautions are in order before we conclude this discussion of the basic logic of probability sampling. First, the survey uses of probability theory as discussed here are technically not wholly justified. The theory of sampling distribution makes assumptions that almost never apply in survey conditions. The exact proportion of samples contained within specified increments of standard errors, for example, mathematically assumes an infinitely large population, an infinite number of samples, and sampling with replacement, that is, every sampling unit selected is "thrown back into the pot"

Nevertheless, the calculations discussed in this section can be extremely valuable to you in understanding and evaluating your data. Although the calculations do not provide as precise estimates as some researchers might assume, they can be quite valid for practical purposes. They are unquestionably more valid than less rigorously derived estimates based on less-rigorous sampling methods. Most important, being familiar with the basic logic underlying the calculations can help you react sensibly both to your own data and to those reported by others.

Populations and Sampling Frames

The preceding section introduced the theoretical model for social research sampling. Although as students, research consumers, and researchers we need to understand that theory, it is no less important to appreciate the less-than-perfect conditions that exist in the field. In this section we'll look at one aspect of field conditions that requires a compromise with idealized theoretical conditions and assumptions: the congruence of or disparity between populations and sampling frames.

Simply put, a sampling frame is the list or quasi list of elements from which a probability sample is selected. If a sample of students is selected from a student roster, the roster is the sampling frame. If the primary sampling unit for a complex population sample is the census block, the list of census blocks composes the sampling

The data reported in this paper ... were gathered from a probability sample of adults aged 18 and over residing in households in the 48 contiguous United States. Personal interviews with 1,914 respondents were conducted by the Survey Research Center of the University of Michigan during the fall of 1975.
(Jackman and Senter 1980:345)

Properly drawn samples provide information appropriate for describing the population of elements composing the sampling frame, and nothing more. I emphasize this point in view of the all-too-common tendency for researchers to select samples from a given sampling frame and then make assertions about a population similar to, but not identical to, the population defined by the sampling frame.

For example, take a look at this report, which discusses the drugs most frequently prescribed by U.S. physicians:

Information on prescription drug sales is not easy to obtain. But Rinaldo V. DeNuzzo, a professor of pharmacy at the Albany College of Pharmacy, Union University, Albany, NY, has been tracking prescription drug sales for 25 years by polling nearby drugstores. He publishes the results in an industry trade magazine, MM&M.

DeNuzzo's latest survey, covering 1980, is based on reports from 66 pharmacies in 48 communities in New York and New Jersey. Unless
frame-in the form of a printed booklet, a mag- there is something peculiar about that part of
proves of the code and that you are 95 percent and could be selected again. Second, our discussion
netic tape file, or some other computerized record. the country, his findings can be taken as repre-
confident that between 40 and 60 percent (plus or has greatly oversimplified the inferential jump from
Here are some reports of sampling frames appear- sentative of what happens across the country.
minus two standard errors) approve. The range the distribution of several samples to the probable
ing in research journals. In each example I've itali- (Moskowitz 1981:33)
from 40 to 60 percent is the confidence interval. characteristics of one sample.
cized the actual sampling frames.
(At the 68 percent confidence level, the confidence I offer these cautions to provide perspective on What is striking in the excerpt is the casual
interval would be 45-55 percent.) the uses of probability theory in sampling. Social The data for this research were obtained from a comment about whether there is anything pecu-
researchers often appear to overestimate the preci- liar about New York and New Jersey. There is. The
random sample of parents of children in the third
sion of estimates produced by the use of probability
grade in public and parochial schools in Yakima
theory. As I'll mention elsewhere in this chapter County, Washington.
confidence level The estimated probability that a and throughout the book, variations in sampling (Petersen and Maynard 1981:92)
population parameter lies within a given confidence in- sampling frame That list or quasi list of units com-
techniques and nonsampling factors may further
terval. Thus, we might be 95 percent confident that be- posing a population from which a sample is selected. if
The sample at Time I consisted of 160 names
reduce the legitimacy of such estimates. For ex- the sample is to be representative of the population, it
tween 35 and 45 percent of all voters favor Candidate A.
ample, those selected in a sample who fail or refuse drawn randomly from the telephone directory of
is essential that the sampling frame include all (or
confidence interval The range of values within
to participate further detract from the representa- Lubbock, Texas. nearly all) members of the population.
which a population parameter is estimated to lie.
(Tan 1980:242)
tiveness of the sample.

lifestyle in these two states hardly typifies the other 48. We cannot assume that residents in these large, urbanized, eastern seaboard states necessarily have the same drug-use patterns as do residents of Mississippi, Nebraska, or Vermont.

Does the survey even represent prescription patterns in New York and New Jersey? To determine that, we would have to know something about the way the 48 communities and the 66 pharmacies were selected. We should be wary in this regard, in view of the reference to "polling nearby drugstores." As we'll see, there are several methods for selecting samples that ensure representativeness, and unless they're used, we shouldn't generalize from the study findings.

A sampling frame, then, must be consonant with the population we wish to study. In the simplest sample design, the sampling frame is a list of the elements composing the study population. In practice, though, existing sampling frames often define the study population rather than the other way around. That is, we often begin with a population in mind for our study; then we search for possible sampling frames. Having examined and evaluated the frames available for our use, we decide which frame presents a study population most appropriate to our needs.

Studies of organizations are often the simplest from a sampling standpoint because organizations typically have membership lists. In such cases, the list of members constitutes an excellent sampling frame. If a random sample is selected from a membership list, the data collected from that sample may be taken as representative of all members, if all members are included in the list.

Populations that can be sampled from good organizational lists include elementary school, high school, and university students and faculty; church members; factory workers; fraternity or sorority members; members of social, service, or political clubs; and members of professional associations.

The preceding comments apply primarily to local organizations. Often, statewide or national organizations do not have a single membership list. There is, for example, no single list of Episcopalian church members. However, a slightly more complex sample design could take advantage of local church membership lists by first sampling churches and then subsampling the membership lists of those churches selected. (More about that later.)

Other lists of individuals may be especially relevant to the research needs of a particular study. Government agencies maintain lists of registered voters, for example, that might be used if you wanted to conduct a preelection poll or an in-depth examination of voting behavior, but you must insure that the list is up-to-date. Similar lists contain the names of automobile owners, welfare recipients, taxpayers, business permit holders, licensed professionals, and so forth. Although it may be difficult to gain access to some of these lists, they provide excellent sampling frames for specialized research purposes.

Realizing that the sampling elements in a study need not be individual persons, we may note that the lists of other types of elements also exist: universities, businesses of various types, cities, academic journals, newspapers, unions, political clubs, professional associations, and so forth.

Telephone directories are frequently used for "quick and dirty" public opinion polls. Undeniably they're easy and inexpensive to use, no doubt the reason for their popularity. And, if you want to make assertions about telephone subscribers, the directory is a fairly good sampling frame. (Realize, of course, that a given directory will not include new subscribers or those who have requested unlisted numbers. Sampling is further complicated by the directories' inclusion of nonresidential listings.) Unfortunately, telephone directories are all too often used as a listing of a city's population or of its voters. Of the many defects in this reasoning, the chief one involves a social-class bias, as we have seen. Poor people are less likely to have telephones; rich people may have more than one line. A telephone directory sample, therefore, is likely to have a middle- or upper-class bias.

The class bias inherent in telephone directory samples is often hidden. Preelection polls conducted in this fashion are sometimes quite accurate, perhaps because of the class bias evident in voting itself: Poor people are less likely to vote. Frequently, then, these two biases nearly coincide, so that the results of a telephone poll may come very close to the final election outcome. Unhappily, you never know for sure until after the election. And sometimes, as in the case of the 1936 Literary Digest poll, you may discover that the voters have not acted according to the expected class biases. The ultimate disadvantage of this method, then, is the researcher's inability to estimate the degree of error to be expected in the sample findings.

Street directories and tax maps are often used for easy samples of households, but they may also suffer from incompleteness and possible bias. For example, in strictly zoned urban regions, illegal housing units are unlikely to appear on official records. As a result, such units could not be selected, and sample findings could not be representative of those units, which are often poorer and more crowded than the average.

Though the preceding comments apply to the United States, the situation is quite different in some other countries. In Japan, for example, the government maintains quite accurate population registration lists. Moreover, citizens are required by law to keep their information up-to-date, such as changes in residence or births and deaths in the household. As a consequence, you can select simple random samples of the population more easily in Japan than in the United States. Such a registration list in the United States would conflict directly with this country's norms regarding individual privacy.

Review of Populations and Sampling Frames

Because social research literature gives surprisingly little attention to the issues of populations and sampling frames, I've devoted special attention to them here. Here is a summary of the main guidelines to remember:

1. Findings based on a sample can be taken as representing only the aggregation of elements that compose the sampling frame.
2. Often, sampling frames do not truly include all the elements their names might imply. Omissions are almost inevitable. Thus, a first concern of the researcher must be to assess the extent of the omissions and to correct them if possible. (Of course, the researcher may feel that he or she can safely ignore a small number of omissions that cannot easily be corrected.)
3. To be generalized even to the population composing the sampling frame, all elements must have equal representation in the frame. Typically, each element should appear only once. Elements that appear more than once will have a greater probability of selection, and the sample will, overall, overrepresent those elements.

Other, more practical matters relating to populations and sampling frames will be treated elsewhere in this book. For example, the form of the sampling frame (such as a list in a publication, a 3-by-5 card file, computer disks, or magnetic tapes) can affect how easy it is to use. And ease of use may often take priority over scientific considerations: An "easier" list may be chosen over a "harder" one, even though the latter is more appropriate to the target population. We should not take a dogmatic position in this regard, but every researcher should carefully weigh the relative advantages and disadvantages of such alternatives.

Types of Sampling Designs

Up to this point, we've focused on simple random sampling (SRS). Indeed, the body of statistics typically used by social researchers assumes such a sample. As you'll see shortly, however, you have several options in choosing your sampling method, and you'll seldom if ever choose simple random sampling. There are two reasons for this. First, with all but the simplest sampling frame, simple random sampling is not feasible. Second, and probably surprisingly, simple random sampling may not be the most accurate method available. Let's turn now to a discussion of simple random sampling and the other options available.

Simple Random Sampling

simple random sampling A type of probability sampling in which the units composing a population are assigned numbers. A set of random numbers is then generated, and the units having those numbers are included in the sample.

As noted, simple random sampling is the basic sampling method assumed in the statistical computations of social research. Because the mathematics

Using a Table of Random Numbers

In social research, it's often appropriate to select a set of random numbers from a table such as the one in Appendix C. Here's how to do that.

Suppose you want to select a simple random sample of 100 people (or other units) out of a population totaling 980.

1. To begin, number the members of the population: in this case, from 1 to 980. Now the problem is to select 100 random numbers. Once you've done that, your sample will consist of the people having the numbers you've selected. (Note: It's not essential to actually number them, as long as you're sure of the total. If you have them in a list, for example, you can always count through the list after you've selected the numbers.)
2. The next step is to determine the number of digits you'll need in the random numbers you select. In our example, there are 980 members of the population, so you'll need three-digit numbers to give everyone a chance of selection. (If there were 11,825 members of the population, you'd need to select five-digit numbers.) Thus, we want to select 100 random numbers in the range from 001 to 980.
3. Now turn to the first page of Appendix C. Notice there are several rows and columns of five-digit numbers, and there are several pages. The table represents a series of random numbers in the range from 00001 to 99999. To use the table for your hypothetical sample, you have to answer these questions:
   a. How will you create three-digit numbers out of five-digit numbers?
   b. What pattern will you follow in moving through the table to select your numbers?
   c. Where will you start?
   Each of these questions has several satisfactory answers. The key is to create a plan and follow it. Here's an example.
4. To create three-digit numbers from five-digit numbers, let's agree to select five-digit numbers from the table but consider only the left-most three digits in each case. If we picked the first number on the first page (10480), we would only consider the 104. (We could agree to take the digits farthest to the right, 480, or the middle three digits, 048, and any of these plans would work.) The key is to make a plan and stick with it. For convenience, let's use the left-most three digits.
5. We can also choose to progress through the tables any way we want: down the columns, up them, across to the right or to the left, or diagonally. Again, any of these plans will work just fine as long as we stick to it. For convenience, let's agree to move down the columns. When we get to the bottom of one column, we'll go to the top of the next; when we exhaust a given page, we'll start at the top of the first column of the next page.
6. Now, where do we start? You can close your eyes and stick a pencil into the table and start wherever the pencil point lands. (I know it doesn't sound scientific, but it works.) Or, if you're afraid you'll hurt the book or miss it altogether, close your eyes and make up a column number and a row number. ("I'll pick the number in the fifth row of column 2.") Start with that number.
7. Let's suppose we decide to start with the fifth number in column 2. If you look on the first page of Appendix C, you'll see that the starting number is 39975. We have selected 399 as our first random number, and we have 99 more to go. Moving down the second column, we select 069, 729, 919, 143, 368, 695, 409, 939, and so forth. At the bottom of column 2 (on the second page of this table), we select number 017 and continue to the top of column 3: 015, 255, and so on.
8. See how easy it is? But trouble lies ahead. When we reach column 5, we are speeding along, selecting 816, 309, 763, 078, 061, 277, 988 ... Wait a minute! There are only 980 students in the senior class. How can we pick number 988? The solution is simple: Ignore it. Any time you come across a number that lies outside your range, skip it and continue on your way: 188, 174, and so forth. The same solution applies if the same number comes up more than once. If you select 399 again, for example, just ignore it the second time.
9. That's it. You keep up the procedure until you've selected 100 random numbers. Returning to your list, your sample consists of person number 399, person number 69, person number 729, and so forth.
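
The box's procedure can also be sketched in code. What follows is a minimal Python illustration, not part of the original text: the function name select_from_table and its parameters are hypothetical, and the table entries are the second column of the Appendix C excerpt printed in Figure 7-11. The sketch keeps the left-most digits of each five-digit entry, skips numbers outside the range 1 to 980, and skips repeats, as steps 4 through 8 describe. The final lines show the computer-based alternative using Python's standard random module; the fixed seed is there only so the sketch is repeatable.

```python
import random


def select_from_table(table_numbers, population_size, sample_size, start=0):
    """Walk a list of five-digit random numbers (given as strings), keep the
    left-most digits of each, and skip out-of-range values and repeats."""
    digits = len(str(population_size))      # 980 -> three-digit numbers
    selected = []
    seen = set()
    for entry in table_numbers[start:]:
        candidate = int(entry[:digits])     # left-most digits only
        if candidate < 1 or candidate > population_size:
            continue                        # outside 1..N: ignore it
        if candidate in seen:
            continue                        # came up before: ignore the repeat
        seen.add(candidate)
        selected.append(candidate)
        if len(selected) == sample_size:
            break
    return selected


# Column 2 of the Appendix C excerpt shown in Figure 7-11.
column_2 = ["15011", "46573", "48360", "93093", "39975", "06907", "72905",
            "91977", "14342", "36857", "69578", "40961", "93969"]

# Start with the fifth number in column 2 (index 4), as in step 7.
first_picks = select_from_table(column_2, population_size=980,
                                sample_size=100, start=4)
print(first_picks)

# The computer equivalent: draw 100 distinct numbers from 1..980 directly.
rng = random.Random(7)                      # seed chosen arbitrarily
computer_sample = rng.sample(range(1, 981), 100)
```

With only this short excerpt the walk runs out of table entries long before reaching 100 picks, but the first selections (399, 69, 729, and so on) follow the same order as in step 7.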

of random sampling are especially complex, we'll detour around them in favor of describing the ways of employing this method in the field.

Once a sampling frame has been properly established, to use simple random sampling the researcher assigns a single number to each element in the list, not skipping any number in the process. A table of random numbers (Appendix C) is then used to select elements for the sample. The box entitled "Using a Table of Random Numbers" explains its use.

If your sampling frame is in a machine-readable form, such as computer disk or magnetic tape, a simple random sample can be selected automatically by computer. (In effect, the computer program numbers the elements in the sampling frame, generates its own series of random numbers, and prints out the list of elements selected.)

Figure 7-11 offers a graphic illustration of simple random sampling. Note that the members of our hypothetical micropopulation have been numbered from 1 to 100. Moving to Appendix C, we decide to use the last two digits of the first column and to begin with the third number from the top. This yields person number 30 as the first one selected into the sample. Number 67 is next, and so forth. (Person 100 would have been selected if "00" had come up in the list.)

Systematic Sampling

systematic sampling A type of probability sampling in which every kth unit in a list is selected for inclusion in the sample: for example, every 25th student in the college directory of students. You compute k by dividing the size of the population by the desired sample size; k is called the sampling interval. Within certain constraints, systematic sampling is a functional equivalent of simple random sampling and usually easier to do. Typically, the first unit is selected at random.

Simple random sampling is seldom used in practice. As you'll see, it's not usually the most efficient method, and it can be laborious if done manually. Typically, simple random sampling requires a list of elements. When such a list is available, researchers usually employ systematic sampling instead.

In systematic sampling, every kth element in the total list is chosen (systematically) for inclusion in the sample. If the list contained 10,000 elements and you wanted a sample of 1,000, you would select every tenth element for your sample. To ensure


against any possible human bias in using this method, you should select the first element at random. Thus, in the preceding example, you would begin by selecting a random number between one and ten. The element having that number is included in the sample, plus every tenth element following it. This method is technically referred to as a systematic sample with a random start. Two terms are frequently used in connection with systematic sampling. The sampling interval is the standard distance between elements selected in the sample: ten in the preceding sample. The sampling ratio is the proportion of elements in the population that are selected: 1/10 in the example.

sampling interval = population size / sample size

sampling ratio = sample size / population size

sampling interval The standard distance between elements selected from a population for a sample.

sampling ratio The proportion of elements in the population that are selected to be in a sample.

FIGURE 7-11
A Simple Random Sample
[Figure: a micropopulation of 100 people numbered 1 to 100, an excerpt from the Appendix C table of random numbers, and the people selected as the sample. The excerpt reads:]

10480 15011 01536
22368 46573 25595
24130 48360 22527
42167 93093 06243
37570 39975 81837
77921 06907 11008
99562 72905 56420
96301 91977 05463
89579 14342 63661
85475 36857 53342
28918 69578 88231
63553 40961 48235
09429 93969 52636

In practice, systematic sampling is virtually identical to simple random sampling. If the list of elements is indeed randomized before sampling, one might argue that a systematic sample drawn from that list is in fact a simple random sample. By now, debates over the relative merits of simple random sampling and systematic sampling have been resolved largely in favor of the latter, simpler method. Empirically, the results are virtually identical. And, as you'll see in a later section, systematic sampling, in some instances, is slightly more accurate than simple random sampling.

There is one danger involved in systematic sampling. The arrangement of elements in the list can make systematic sampling unwise. Such an arrangement is usually called periodicity. If the list of elements is arranged in a cyclical pattern that coincides with the sampling interval, a grossly biased sample may be drawn. Here are two examples that illustrate this danger.

In a classic study of soldiers during World War II, the researchers selected a systematic sample from unit rosters. Every tenth soldier on the roster was selected for the study. The rosters, however, were arranged in a table of organizations: sergeants first, then corporals and privates, squad by squad. Each squad had ten members. As a result, every tenth person on the roster was a squad sergeant. The systematic sample selected contained only sergeants. It could, of course, have been the case that no sergeants were selected for the same reason.

As another example, suppose we select a sample of apartments in an apartment building. If the sample is drawn from a list of apartments arranged in numerical order (for example, 101, 102, 103, 104, 201, 202, and so on), there is a danger of the sampling interval coinciding with the number of apartments on a floor or some multiple thereof. Then the samples might include only northwest-corner apartments or only apartments near the elevator. If these types of apartments have some other particular characteristic in common (for example, higher rent), the sample will be biased. The same danger would appear in a systematic sample of houses in a subdivision arranged with the same number of houses on a block.

In considering a systematic sample from a list, then, you should carefully examine the nature of that list. If the elements are arranged in any particular order, you should figure out whether that order will bias the sample to be selected and then take steps to counteract any possible bias (for example, take a simple random sample from cyclical portions).

Usually, however, systematic sampling is superior to simple random sampling, in convenience if nothing else. Problems in the ordering of elements in the sampling frame can usually be remedied quite easily.

Stratified Sampling

stratification The grouping of the units composing a population into homogeneous groups (or strata) before sampling. This procedure, which may be used in conjunction with simple random, systematic, or cluster sampling, improves the representativeness of a sample, at least in terms of the stratification variables.

So far we have discussed two methods of sample selection from a list: random and systematic. Stratification is not an alternative to these methods; rather, it represents a possible modification of their use.

Simple random sampling and systematic sampling both ensure a degree of representativeness and permit an estimate of the error present. Stratified sampling is a method for obtaining a greater degree of representativeness by decreasing the probable sampling error. To understand this method, we must return briefly to the basic theory of sampling distribution.

Recall that sampling error is reduced by two factors in the sample design. First, a large sample produces a smaller sampling error than does a small sample. Second, a homogeneous population produces samples with smaller sampling errors than does a heterogeneous population. If 99 percent of the population agrees with a certain statement, it's extremely unlikely that any probability sample will greatly misrepresent the extent of agreement. If the population is split 50-50 on the statement, then the sampling error will be much greater.

Stratified sampling is based on this second factor in sampling theory. Rather than selecting your


sample from the total population at large, the researcher ensures that appropriate numbers of elements are drawn from homogeneous subsets of that population. To get a stratified sample of university students, for example, you would first organize your population by college class and then draw appropriate numbers of freshmen, sophomores, juniors, and seniors. In a nonstratified sample, representation by class would be subjected to the same sampling error as would other variables. In a sample stratified by class, the sampling error on this variable is reduced to zero.

More-complex stratification methods are also possible. In addition to stratifying by class, you might also stratify by gender, by GPA, and so forth. In this fashion you might be able to ensure that your sample would contain the proper numbers of male sophomores with a 3.5 average, of female sophomores with a 4.0 average, and so forth.

The ultimate function of stratification, then, is to organize the population into homogeneous subsets (with heterogeneity between subsets) and to select the appropriate number of elements from each. To the extent that the subsets are homogeneous on the stratification variables, they may be homogeneous on other variables as well. Because age is related to college class, a sample stratified by class will be more representative in terms of age as well, compared with an unstratified sample. Because occupational aspirations still seem to be related to gender, a sample stratified by gender will be more representative in terms of occupational aspirations.

The choice of stratification variables typically depends on what variables are available. Gender can often be determined in a list of names. University lists are typically arranged by class. Lists of faculty members may indicate their departmental affiliation. Government agency files may be arranged by geographical region. Voter registration lists are arranged according to precinct.

In selecting stratification variables from among those available, however, you should be concerned primarily with those that are presumably related to variables you want to represent accurately. Because gender is related to many variables and is often available for stratification, it is often used. Education is related to many variables, but it is often not available for stratification. Geographical location within a city, state, or nation is related to many things. Within a city, stratification by geographical location usually increases representativeness in social class, ethnic group, and so forth. Within a nation, it increases representativeness in a broad range of attitudes as well as in social class and ethnicity.

When you're working with a simple list of all elements in the population, two methods of stratification predominate. In one method, you sort the population elements into discrete groups based on whatever stratification variables are being used. On the basis of the relative proportion of the population represented by a given group, you select (randomly or systematically) several elements from that group constituting the same proportion of your desired sample size. For example, if sophomore men with a 4.0 average compose 1 percent of the student population and you desire a sample of 1,000 students, you would select 10 sophomore men with a 4.0 average.

The other method is to group students as described and then put those groups together in a continuous list, beginning with all freshmen men with a 4.0 average and ending with all senior women with a 1.0 or below. You would then select a systematic sample, with a random start, from the entire list. Given the arrangement of the list, a systematic sample would select proper numbers (within an error range of 1 or 2) from each subgroup. (Note: A simple random sample drawn from such a composite list would cancel out the stratification.)

Figure 7-12 offers a graphic illustration of stratified, systematic sampling. As you can see, we lined up our micropopulation according to gender and race. Then, beginning with a random start of "3," we've taken every tenth person thereafter: 3, 13, 23, ..., 93.

FIGURE 7-12
A Stratified, Systematic Sample with a Random Start
[Figure: the micropopulation of 100 people, numbered 1 to 100 and lined up by gender and race, with the random start marked and every tenth person thereafter shown as the sample.]

Stratified sampling ensures the proper representation of the stratification variables; this, in turn, enhances the representation of other variables related to them. Taken as a whole, then, a stratified sample is more likely than a simple random sample to be more representative on several variables. Although the simple random sample is still regarded as somewhat sacred, it should now be clear that you can often do better.

Implicit Stratification in Systematic Sampling

I mentioned that systematic sampling can, under certain conditions, be more accurate than simple random sampling. This is the case whenever the arrangement of the list creates an implicit stratification. As already noted, if a list of university students is arranged by class, then a systematic sample provides a stratification by class where a simple random sample would not.

In a study of students at the University of Hawaii, after stratification by school class, the students were arranged by their student identification numbers. These numbers, however, were their social security numbers. The first three digits of the social security number indicate the state in which the number was issued. As a result, within a class, students were arranged by the state in which they were issued a social security number, providing a rough stratification by geographical origin.

An ordered list of elements, therefore, may be more useful to you than an unordered, randomized list. I've stressed this point in view of the unfortunate belief that lists should be randomized before systematic sampling. Only if the arrangement presents the problems discussed earlier should the list be rearranged.

Illustration: Sampling University Students

Let's put these principles into practice by looking at an actual sampling design used to select a sample of university students. The purpose of the study was to survey, with a mail-out questionnaire, a representative cross section of students attending the main campus of the University of Hawaii. The following sections describe the steps and decisions involved in selecting that sample.

Study Population and Sampling Frame

The obvious sampling frame available for use in this sample selection was the computerized file maintained by the university administration. The tape contained students' names, local and permanent addresses, and social security numbers, as well as a variety of other information such as field of study, class, age, and gender.

The computer database, however, contained files on all people who could, by any conceivable definition, be called students, many of whom seemed inappropriate to the purposes of the study. As a result, researchers needed to define the study population in a somewhat more restricted fashion. The final definition included those 15,225 day-program degree candidates who were registered for the fall semester on the Manoa campus of the university, including all colleges and departments, both undergraduate and graduate students, and both U.S. and foreign students. The computer program used for sampling, therefore, limited consideration to students fitting this definition.

Stratification

The sampling program also permitted stratification of students before sample selection. The researchers decided that stratification by college class would be sufficient, although the students might have been further stratified within class, if desired, by gender, college, major, and so forth.

Sample Selection

Once the students had been arranged by class, a systematic sample was selected across the entire rearranged list. The sample size for the study was initially set at 1,100. To achieve this sample, the sampling program was set for a 1/14 sampling ratio. The program generated a random number between 1 and 14; the student having that number and every fourteenth student thereafter was selected in the sample.

Once the sample had been selected, the computer was instructed to print each student's name and mailing address on self-adhesive mailing labels. These labels were then simply transferred to envelopes for mailing the questionnaires.

Sample Modification

This initial design of the sample had to be modified. Before the mailing of questionnaires, the researchers discovered that unexpected expenses in the production of the questionnaires made it impossible to cover the costs of mailing to all 1,100 students. As a result, one-third of the mailing labels were systematically selected (with a random start) for exclusion from the sample. The final sample for the study was thereby reduced to 733 students.

I mention this modification to illustrate the frequent need to alter a study plan in midstream. Because the excluded students were systematically omitted from the initial systematic sample, the remaining 733 students could still be taken as reasonably representing the study population. The reduction in sample size did, of course, increase the range of sampling error.

Multistage Cluster Sampling

The preceding sections have dealt with reasonably simple procedures for sampling from lists of elements. Such a situation is ideal. Unfortunately, however, much interesting social research requires the selection of samples from populations that cannot easily be listed for sampling purposes: the population of a city, state, or nation; all university students in the United States; and so forth. In such cases, the sample design must be much more complex. Such a design typically involves the initial sampling of groups of elements (clusters), followed by the selection of elements within each of the selected clusters.

cluster sampling: A multistage sampling in which natural groups (clusters) are sampled initially, with the members of each selected group being subsampled afterward. For example, you might select a sample of U.S. colleges and universities from a directory, get lists of the students at all the selected schools, then draw samples of students from each.

Cluster sampling may be used when it's either impossible or impractical to compile an exhaustive list of the elements composing the target population, such as all church members in the United States. Often, however, the population elements are already grouped into subpopulations, and a list of those subpopulations either exists or can be created practically. For example, church members in the United States belong to discrete churches, which are either listed or could be. Following a cluster sample format, then, researchers could sample the list of churches in some manner (for example, a stratified, systematic sample). Next, they would obtain lists of members from each of the selected churches. Each of the lists would then be sampled, to provide samples of church members for study. (For an example, see Glock, Ringer, and Babbie 1967.)

Another typical situation concerns sampling among population areas such as a city. Although there is no single list of a city's population, citizens reside on discrete city blocks or census blocks. Researchers can, therefore, select a sample of blocks initially, create a list of people living on each of the selected blocks, and take a subsample of the people on each block.

In a more complex design, researchers might sample blocks, list the households on each selected block, sample the households, list the people residing in each household, and, finally, sample the people within each selected household. This multistage sample design leads ultimately to a selection of a sample of individuals but does not require the initial listing of all individuals in the city's population.

Multistage cluster sampling, then, involves the repetition of two basic steps: listing and sampling. The list of primary sampling units (churches, blocks) is compiled and, perhaps, stratified for sampling. Then a sample of those units is selected. The selected primary sampling units are then listed and perhaps stratified. The list of secondary sampling units is then sampled, and so forth.

Multistage cluster sampling makes possible those studies that would otherwise be impossible. Consider, for example, the "Santa Claus" survey described in the box entitled "Sampling Santa's Fans."

Sampling Santa's Fans

With the approach of Christmas 1985, the New York Times thought it would be interesting to survey the nation's children regarding their beliefs in Santa Claus. There being no national registry of those who've been naughty and nice, the Times had to use some ingenuity. Here's their description of what they did:

    The latest New York Times Poll is based on telephone interviews conducted December 14-18 with 261 children aged 3 through 10 around the United States, excluding Alaska and Hawaii.

    The sample of telephone exchanges called was selected by a computer from a complete list of exchanges in the country. The exchanges were chosen to ensure that each region of the country was represented in proportion to its population. For each exchange, the telephone numbers were formed by random digits, thus permitting access to both listed and unlisted residential numbers.

    After interviews with 1,358 adults were completed, parents were asked if their children could be interviewed on the subject of Christmas. The results have been weighted to take account of household size and number of residential telephones and to adjust for variations in the sample relating to region, race, sex, age and education.

By the way, 87 percent of the children said they believed in Santa Claus: ranging from 96 percent among those 3-5 down to 69 percent among the 9-10-year-olds.

Source: Sara Rimer, "Poll Sees Landslide for Santa: Of U.S. Children, 87% Believe," New York Times, December 24, 1985.
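The two basic steps of multistage cluster sampling, listing and sampling, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the churches, their membership lists, and the function name two_stage_sample are hypothetical, and simple random selection is used at both stages where a real design might stratify or sample systematically.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: sample clusters. Stage 2: list and sample members of each selected cluster."""
    # Sample from the list of clusters (the primary sampling units).
    selected = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in selected:
        members = clusters[name]              # only selected clusters need to be listed
        k = min(n_per_cluster, len(members))
        sample.extend((name, m) for m in rng.sample(members, k))
    return selected, sample

rng = random.Random(1)                        # fixed seed, only for repeatability

# Hypothetical population: 50 churches with 20 to 200 members each.
churches = {
    f"church_{i:02d}": [f"member_{i:02d}_{j}" for j in range(rng.randint(20, 200))]
    for i in range(50)
}

selected, sample = two_stage_sample(churches, n_clusters=10, n_per_cluster=5, rng=rng)
listed = sum(len(churches[c]) for c in selected)      # members we actually had to list
total = sum(len(m) for m in churches.values())        # members in the whole population
print(len(selected), len(sample), listed, total)
```

Notice that only the members of the 10 selected churches ever have to be listed, which is the practical advantage the text describes.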

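Returning for a moment to the university illustration, the Sample Selection and Sample Modification steps described earlier can also be sketched in code. Treat this as a rough sketch rather than the researchers' actual program: the frame of 15,225 numbered records is a stand-in for the real student file, and the function name systematic_sample is my own. A 1/14 ratio applied to 15,225 records yields 1,087 or 1,088 selections, which is close to the 1,100 set in the text, so the counts below approximate the reported figures rather than reproduce them.

```python
import random

def systematic_sample(frame, interval, rng):
    """Pick a random start in 1..interval, then take every interval-th element."""
    start = rng.randint(1, interval)          # 1-based position of the first selection
    return frame[start - 1::interval]

rng = random.Random(7)                        # fixed seed, only for repeatability

# Hypothetical frame: 15,225 student records, assumed already arranged by class.
frame = list(range(1, 15_226))

# Sample selection: 1/14 sampling ratio with a random start between 1 and 14.
initial = systematic_sample(frame, 14, rng)

# Sample modification: systematically exclude one-third of the selected labels,
# again with a random start (here between 1 and 3).
drop_start = rng.randint(1, 3)
final = [s for i, s in enumerate(initial, start=1) if (i - drop_start) % 3 != 0]

print(len(initial), len(final))
```

Because the exclusion is itself systematic, the remaining cases are still spread evenly across the class-ordered list, which is why the text treats the reduced sample as reasonably representative.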

Multistage Designs and Sampling Error

Although cluster sampling is highly efficient, the price of that efficiency is a less-accurate sample. A simple random sample drawn from a population list is subject to a single sampling error, but a two-stage cluster sample is subject to two sampling errors. First, the initial sample of clusters will represent the population of clusters only within a range of sampling error. Second, the sample of elements selected within a given cluster will represent all the elements in that cluster only within a range of sampling error. Thus, for example, a researcher runs a certain risk of selecting a sample of disproportionately wealthy city blocks, plus a sample of disproportionately wealthy households within those blocks. The best solution to this problem lies in the number of clusters selected initially and the number of elements within each cluster.

Typically, researchers are restricted to a total sample size; for example, you may be limited to conducting 2,000 interviews in a city. Given this broad limitation, however, you have several options in designing your cluster sample. At the extremes you could choose one cluster and select 2,000 elements within that cluster, or you could select 2,000 clusters with one element selected within each. Of course, neither approach is advisable, but a broad range of choices lies between them. Fortunately, the logic of sampling distributions provides a general guideline for this task.

Recall that sampling error is reduced by two factors: an increase in the sample size and increased homogeneity of the elements being sampled. These factors operate at each level of a multistage sample design. A sample of clusters will best represent all clusters if a large number are selected and if all clusters are very much alike. A sample of elements will best represent all elements in a given cluster if a large number are selected from the cluster and if all the elements in the cluster are very much alike.

With a given total sample size, however, if the number of clusters is increased, the number of elements within a cluster must be decreased. In this respect, the representativeness of the clusters is increased at the expense of more poorly representing the elements composing each cluster, or vice versa. Fortunately, homogeneity can be used to ease this dilemma.

Typically, the elements composing a given natural cluster within a population are more homogeneous than are all elements composing the total population. The members of a given church are more alike than are all church members; the residents of a given city block are more alike than are the residents of a whole city. As a result, relatively few elements may be needed to represent a given natural cluster adequately, although a larger number of clusters may be needed to represent adequately the diversity found among the clusters. This fact is most clearly seen in the extreme case of very different clusters composed of identical elements within each. In such a situation, a large number of clusters would adequately represent all its members. Although this extreme situation never exists in reality, it's closer to the truth in most cases than its opposite: identical clusters composed of grossly divergent elements.

The general guideline for cluster design, then, is to maximize the number of clusters selected while decreasing the number of elements within each cluster. However, this scientific guideline must be balanced against an administrative constraint. The efficiency of cluster sampling is based on the ability to minimize the listing of population elements. By initially selecting clusters, you need only list the elements composing the selected clusters, not all elements in the entire population. Increasing the number of clusters, however, goes directly against this efficiency factor. A small number of clusters may be listed more quickly and more cheaply than a large number. (Remember that all the elements in a selected cluster must be listed even if only a few are to be chosen in the sample.)

The final sample design will reflect these two constraints. In effect, you'll probably select as many clusters as you can afford. Lest this issue be left too open-ended at this point, here is one general guideline. Population researchers conventionally aim at the selection of 5 households per census block. If a total of 2,000 households are to be interviewed, you would aim at 400 blocks with 5 household interviews on each. Figure 7-13 presents a graphic overview of this process.

FIGURE 7-13
Multistage Cluster Sampling

Stage One: Identify blocks and select a sample. (Selected blocks are shaded.) The map shows blocks bounded by 1st St. through 5th St.

Stage Two: Go to each selected block and list all households in order. (Example of one listed block.)

1. 491 Rosemary Ave.
2. 487 Rosemary Ave.
3. 473 Rosemary Ave.
4. 455 Rosemary Ave.
5. 437 Rosemary Ave. (selected)
6. 423 Rosemary Ave.
7. 411 Rosemary Ave.
8. 403 Rosemary Ave.
9. 1101 4th St.
10. 1123 4th St.
11. 1137 4th St. (selected)
12. 1157 4th St.
13. 1169 4th St.
14. 1187 4th St.
15. 402 Thyme Ave.
16. 408 Thyme Ave.
17. 424 Thyme Ave. (selected)
18. 446 Thyme Ave.
19. 458 Thyme Ave.
20. 480 Thyme Ave.
21. 498 Thyme Ave.
22. 1186 5th St.
23. 1174 5th St. (selected)
24. 1160 5th St.
25. 1140 5th St.
26. 1122 5th St.
27. 1118 5th St.
28. 1116 5th St.
29. 1104 5th St. (selected)
30. 1102 5th St.

Stage Three: For each list, select a sample of households. (In this example, every sixth household has been selected starting with #5, which was selected at random.)
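The guideline of 5 households per block, and the household selection shown in Figure 7-13, can be checked with a short sketch. The function names allocate and every_kth are hypothetical, and the household labels are stand-ins for the 30 listed addresses; the random start of 5 is taken from the figure rather than drawn here.

```python
def allocate(total_interviews, per_cluster):
    """Number of clusters implied by a fixed total sample size and a fixed take per cluster."""
    blocks, remainder = divmod(total_interviews, per_cluster)
    if remainder:
        raise ValueError("total is not a multiple of the per-cluster take")
    return blocks

def every_kth(listing, k, start):
    """Systematic selection from a 1-based listing: start, start + k, start + 2k, ..."""
    return [(pos, item) for pos, item in enumerate(listing, start=1)
            if pos >= start and (pos - start) % k == 0]

# 2,000 interviews at 5 households per block implies 400 blocks.
blocks_needed = allocate(2000, 5)

# Stage Three of the figure: 30 listed households, every sixth one starting with #5.
households = [f"household_{i}" for i in range(1, 31)]
picked = every_kth(households, k=6, start=5)
positions = [pos for pos, _ in picked]

print(blocks_needed)   # 400
print(positions)       # [5, 11, 17, 23, 29]
```

Trying other takes per block (say, 10 or 20) with allocate shows the trade-off in the text directly: the larger the take within each block, the fewer blocks a fixed total of 2,000 interviews can cover.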

Before we turn to other, more detailed procedures available to cluster sampling, let me reiterate that this method almost inevitably involves a loss of accuracy. The manner in which this appears, however, is somewhat complex. First, as noted earlier, a multistage sample design is subject to a sampling error at each stage. Because the sample size is necessarily smaller at each stage than the total sample size, the sampling error at each stage will be greater than would be the case for a single-stage random sample of elements. Second, sampling error is estimated on the basis of observed variance among the sample elements. When those elements are drawn from among relatively homogeneous clusters, the estimated sampling error will be too optimistic and must be corrected in the light of the cluster sample design.

Stratification in Multistage Cluster Sampling

Thus far, we've looked at cluster sampling as though a simple random sample were selected at each stage of the design. In fact, stratification techniques can be used to refine and improve the sample being selected.

The basic options here are essentially the same as those in single-stage sampling from a list. In selecting a national sample of churches, for example, you might initially stratify your list of churches by denomination, geographical region, size, rural or urban location, and perhaps by some measure of social class.

Once the primary sampling units (churches, blocks) have been grouped according to the relevant, available stratification variables, either simple random or systematic sampling techniques can be used to select the sample. You might select a specified number of units from each group, or stratum, or you might arrange the stratified clusters in a continuous list and systematically sample that list.

To the extent that clusters are combined into homogeneous strata, the sampling error at this stage will be reduced. The primary goal of stratification, as before, is homogeneity.

There's no reason why stratification couldn't take place at each level of sampling. The elements listed within a selected cluster might be stratified before the next stage of sampling. Typically, however, this is not done. (Recall the assumption of relative homogeneity within clusters.)

Probability Proportionate to Size (PPS) Sampling

This section introduces you to a more sophisticated form of cluster sampling, one that is used in many large-scale survey sampling projects. In the preceding discussion, I talked about selecting a random or systematic sample of clusters and then a random or systematic sample of elements within each cluster selected. Notice that this produces an overall sampling scheme in which every element in the whole population has the same probability of selection.

Let's say we're selecting households within a city. If there are 1,000 city blocks and we initially select a sample of 100, that means that each block has a 100/1,000 or .1 chance of being selected. If we next select 1 household in 10 from those residing on the selected blocks, each household has a .1 chance of selection within its block. To calculate the overall probability of a household being selected, we simply multiply the probabilities at the individual steps in sampling. That is, each household has a 1/10 chance of its block being selected and a 1/10 chance of that specific household being selected if the block is one of those chosen. Each household, in this case, has a 1/10 × 1/10 = 1/100 chance of selection overall. Because each household would have the same chance of selection, the sample so selected should be representative of all households in the city.

There are dangers in this procedure, however. In particular, the variation in the size of blocks (measured in numbers of households) presents a problem. Let's suppose that half the city's population resides in 10 densely packed blocks filled with high-rise apartment buildings, and suppose that the rest of the population lives in single-family dwellings spread out over the remaining 990 blocks. When we first select our sample of 1/10 of the blocks, it's quite possible that we'll miss all of the 10 densely packed high-rise blocks. No matter what happens in the second stage of sampling, our final sample of households will be grossly unrepresentative of the city, comprising only single-family dwellings.

Whenever the clusters sampled are of greatly differing sizes, it's appropriate to use a modified sampling design called probability proportionate to size (PPS). This design guards against the problem I've just described and still produces a final sample in which each element has the same chance of selection.

As the name suggests, each cluster is given a chance of selection proportionate to its size. Thus, a city block with 200 households has twice the chance of selection as one with only 100 households. Within each cluster, however, a fixed number of elements is selected, say, 5 households per block. Notice how this procedure results in each household having the same probability of selection overall.

Let's look at households of two different city blocks. Block A has 100 households, Block B has only 10. In PPS sampling, we would give Block A ten times as good a chance of being selected as Block B. So if, in the overall sample design, Block A has a 1/20 chance of being selected, that means Block B would only have a 1/200 chance. Notice that this means that all the households on Block A would have a 1/20 chance of having their block selected; Block B households have only a 1/200 chance.

If Block A is selected and we're taking 5 households from each selected block, then the households on Block A have a 5/100 chance of being selected into the block's sample. Since we can multiply probabilities in a case like this, we see that every household on Block A had an overall chance of selection equal to 1/20 × 5/100 = 5/2000 = 1/400.

If Block B happens to be selected, on the other hand, its households stand a much better chance of being among the 5 chosen there: 5/10. When this is combined with their relatively poorer chance of having their block selected in the first place, however, they end up with the same chance of selection as those on Block A: 1/200 × 5/10 = 5/2000 = 1/400.

Further refinements to this design make it a very efficient and effective method for selecting large cluster samples. For now, however, it's enough to understand the basic logic involved.

PPS (probability proportionate to size): This refers to a type of multistage cluster sample in which clusters are selected not with equal probabilities (see EPSEM) but with probabilities proportionate to their sizes, as measured by the number of units to be subsampled.

Disproportionate Sampling and Weighting

Ultimately, a probability sample is representative of a population if all elements in the population have an equal chance of selection in that sample. Thus, in each of the preceding discussions, we've noted that the various sampling procedures result in an equal chance of selection, even though the ultimate selection probability is the product of several partial probabilities.

More generally, however, a probability sample is one in which each population element has a known nonzero probability of selection, even though different elements may have different probabilities. If controlled probability sampling procedures have been used, any such sample may be representative of the population from which it is drawn if each sample element is assigned a weight equal to the inverse of its probability of selection. Thus, where all sample elements have had the same chance of selection, each is given the same weight: 1. This is called a self-weighting sample.

Sometimes it's appropriate to give some cases more weight than others, a process called weighting. Disproportionate sampling and weighting come into play in two basic ways. First, you may sample subpopulations disproportionately to ensure sufficient numbers of cases from each for analysis. For example, a given city may have a suburban area containing one-fourth of its total population. Yet you might be especially interested in a detailed analysis of households in that area and may feel that one-fourth of this total sample size would be too few. As a result, you might decide to select the same number of households from the suburban area as from the remainder of the city.

weighting: Assigning different weights to cases that were selected into a sample with different probabilities of selection. In the simplest scenario, each case is given a weight equal to the inverse of its probability of selection. When all cases have the same chance of selection, no weighting is necessary.
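The probability arithmetic in the PPS discussion, and the inverse-probability weights defined under Disproportionate Sampling and Weighting, can be verified with exact fractions. This is a minimal sketch: the function name overall_probability is hypothetical, and the block sizes and selection chances are simply the ones used in the Block A and Block B example.

```python
from fractions import Fraction

def overall_probability(p_block, take, block_size):
    """Overall chance = P(block is selected) * P(household is selected within the block)."""
    return p_block * Fraction(take, block_size)

# Equal-probability two-stage design: 100 of 1,000 blocks, then 1 household in 10.
equal = Fraction(100, 1000) * Fraction(1, 10)

# PPS: block chances proportionate to size, with a fixed take of 5 households.
p_a = Fraction(1, 20)                   # Block A, 100 households
p_b = p_a * Fraction(10, 100)           # Block B, 10 households: one-tenth of A's chance

overall_a = overall_probability(p_a, take=5, block_size=100)
overall_b = overall_probability(p_b, take=5, block_size=10)

# The weight for each case is the inverse of its selection probability.
weight_a = 1 / overall_a
weight_b = 1 / overall_b

print(equal, p_b, overall_a, overall_b)   # 1/100 1/200 1/400 1/400
print(weight_a, weight_b)                 # 400 400
```

Because both blocks end up with the same overall probability, every household carries the same weight, which is what the text means by a self-weighting sample.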


Households in the suburban area, then, are given a disproportionately better chance of selection than are those located elsewhere in the city.

As long as you analyze the two area samples separately or comparatively, you need not worry about the differential sampling. If you want to combine the two samples to create a composite picture of the entire city, however, you must take the disproportionate sampling into account. If n is the number of households selected from each area, then the households in the suburban area had a chance of selection equal to n divided by one-fourth of the total city population. Because the total city population and the sample size are the same for both areas, the suburban-area households should be given a weight of ¼n, and the remaining households should be given a weight of ¾n. This weighting procedure could be simplified by merely giving a weight of 3 to each of the households selected outside the suburban area.

Here's an example of the problems that can be created when disproportionate sampling is not accompanied by a weighting scheme. When the Harvard Business Review decided to survey its subscribers on the issue of sexual harassment at work, it seemed appropriate to oversample women because female subscribers were vastly outnumbered by male subscribers. Here's how G. C. Collins and Timothy Blodgett explained the matter:

    We also skewed the sample another way: to ensure a representative response from women, we mailed a questionnaire to virtually every female subscriber, for a male/female ratio of 68% to 32%. This bias resulted in a response of 52% male and 44% female (and 4% who gave no indication of gender), compared to HBR's U.S. subscriber proportion of 93% male and 7% female.
    (1981:78)

Notice a couple of things in this excerpt. First, it would be nice to know a little more about what "virtually every female" means. Evidently, the authors of the study didn't send questionnaires to all female subscribers, but there's no indication of who was omitted and why. Second, they didn't use the term representative with its normal social science usage. What they mean, of course, is that they wanted to get a substantial or "large enough" response from women, and oversampling is a perfectly acceptable way of accomplishing that.

By sampling more women than a straightforward probability sample would have produced, the authors were able to "select" enough women (812) to compare with the men (960). Thus, when they report, for example, that 32 percent of the women and 66 percent of the men agree that "the amount of sexual harassment at work is greatly exaggerated," we know that the female response is based on a substantial number of cases. That's good. There are problems, however.

To begin with, subscriber surveys are always problematic. In this case, the best the researchers can hope to talk about is "what subscribers to Harvard Business Review think." In a loose way, it might make sense to think of that population as representing the more sophisticated portion of corporate management. Unfortunately, the overall response rate was 25 percent. Although that's quite good for subscriber surveys, it's a low response rate in terms of generalizing from probability samples.

Beyond that, however, the disproportionate sample design creates another problem. When the authors state that 73 percent of respondents favor company policies against harassment (Collins and Blodgett 1981:78), that figure is undoubtedly too high, since the sample contains a disproportionately high percentage of women, who are more likely than men to favor such policies. And, when the researchers report that top managers are more likely to feel that claims of sexual harassment are exaggerated than are middle- and lower-level managers (1981:81), that finding is also suspect. As the researchers report, women are disproportionately represented in lower management. That alone might account for the apparent differences among levels of management. In short, the failure to take account of the oversampling of women confounds all survey results that don't separate the findings by gender. The solution to this problem would have been to weight the responses by gender, as described earlier in this section.

Probability Sampling in Review

Much of this chapter has been devoted to the key sampling method used in controlled survey research: probability sampling. In each of the variations examined, we've seen that elements are chosen for study from a population on a basis of random selection with known nonzero probabilities.

Depending on the field situation, probability sampling can be either very simple or extremely difficult, time consuming, and expensive. Whatever the situation, however, it remains the most effective method for the selection of study elements. There are two reasons for this.

First, probability sampling avoids researchers' conscious or unconscious biases in element selection. If all elements in the population have an equal (or unequal and subsequently weighted) chance of selection, there is an excellent chance that the sample so selected will closely represent the population of all elements.

Second, probability sampling permits estimates of sampling error. Although no probability sample will be perfectly representative in all respects, controlled selection methods permit the researcher to estimate the degree of expected error.

In this lengthy chapter, we've taken on a basic issue in much social research: selecting observations that will tell us something more general than the specifics we've actually observed. This issue confronts field researchers, who face more action and more actors than they can observe and record fully, as well as political pollsters who want to predict an election but can't interview all voters. As we proceed through the book, we'll see in greater detail how social researchers have found ways to deal with this issue.

MAIN POINTS

Social researchers must select observations that will allow them to generalize to people and events not observed. Often this involves sampling, a selection of people to observe.

Sometimes you can and should select probability samples using precise statistical techniques, but other times nonprobability techniques are more appropriate.

Nonprobability sampling techniques include relying on available subjects, purposive or judgmental sampling, snowball sampling, and quota sampling. In addition, researchers studying a social group may make use of informants. Each of these techniques has its uses, but none of them ensures that the resulting sample is representative of the population being sampled.

Probability sampling methods provide an excellent way of selecting representative samples from large, known populations. These methods counter the problems of conscious and unconscious sampling bias by giving each element in the population a known (nonzero) probability of selection.

The key to probability sampling is random selection.

The most carefully selected sample will never provide a perfect representation of the population from which it was selected. There will always be some degree of sampling error.

By predicting the distribution of samples with respect to the target parameter, probability sampling methods make it possible to estimate the amount of sampling error expected in a given sample.

The expected error in a sample is expressed in terms of confidence levels and confidence intervals.

A sampling frame is a list or quasi list of the members of a population. It is the resource used in the selection of a sample. A sample's representativeness depends directly on the extent to which a sampling frame contains all the members of the total population that the sample is intended to represent.

Several sampling designs are available to researchers.

Simple random sampling is logically the most fundamental technique in probability sampling, but it is seldom used in practice.

Systematic sampling involves the selection of every kth member from a sampling frame. This


method is more practical than simple random sampling; with a few exceptions, it is functionally equivalent.

Stratification, the process of grouping the members of a population into relatively homogeneous strata before sampling, improves the representativeness of a sample by reducing the degree of sampling error.

Multistage cluster sampling is a relatively complex sampling technique that frequently is used when a list of all the members of a population does not exist. Typically, researchers must balance the number of clusters and the size of each cluster to achieve a given sample size. Stratification can be used to reduce the sampling error involved in multistage cluster sampling.

Probability proportionate to size (PPS) is a special, efficient method for multistage cluster

systematic sampling
sampling interval
sampling ratio
stratification
cluster sampling
PPS
weighting

REVIEW QUESTIONS AND EXERCISES

1. Review the discussion of the 1948 Gallup Poll that predicted that Thomas Dewey would defeat Harry Truman for president. What are some ways Gallup could have modified his quota sample design to avoid the error?

2. Using Appendix C of this book, select a simple random sample of 10 numbers in the range of 1 to 9,876. What is each step in the process?

3. What are the steps involved in selecting a multistage cluster sample of students taking first-year English in U.S. colleges and universities?

passages intermingle as Kish exhausts everything you could want or need to know about each aspect of sampling.

Sudman, Seymour. 1983. "Applied Sampling." Pp. 145-94 in Handbook of Survey Research, edited by Peter H. Rossi, James D. Wright, and Andy B. Anderson. New York: Academic Press. An excellent, practical guide to survey sampling.

RESOURCES ON THE INTERNET

VIRTUAL SOCIETY'S COMPANION WEB SITE FOR THE PRACTICE OF SOCIAL RESEARCH, 10TH EDITION
http://www.wadsworth.com/sociology
Once at the Virtual Society, click on "Find Companion Sites" from the left navigation bar, click on "Research Methods and Statistics," and then click on your book cover. On the companion site, you will find useful

Bill Trochim, Probability Sampling 1
http://trochim.human.cornell.edu/kb/sampprob.htm

Survey Sampling, Inc., The Frame
http://www.worldopinion.com/the_framel

Bureau of Labor Statistics and Census Bureau, Sampling
http://www.bls.census.gov/cps/bsampdes.htm

INFOTRAC COLLEGE EDITION
http://www.infotrac-college.com/wadsworth/access.html
Access the latest news and journal articles with InfoTrac College Edition, an easy-to-use online database of reliable, full-length articles from hundreds of top academic journals. Conduct an electronic search using the following terms:

Cluster sample
Quota sample
Confidence interval Sampling bias
sampling. 4. In Chapter 9, we'll discuss surveys conducted on learning resources for your course. Some of those re-
the Internet. Can you anticipate possible problems sources include Tutorial Quizzes with feedback, Inter- Confidence level Sampling distribution
If the members of a population have unequal
Multistage sample Sampling error
concerning sampling frames, representativeness, net Exercises, Flashcards, and Chapter Tutorials for
probabilities of selection into the sample, re-
and the like? Do you see any solutions? every chapter, as well as Extended Projects, Social Re- Nonprobability sample Sampling frame
searchers must assign weights to the different search in Cyberspace, and Primers for using various
5. Using InfoTrac College Edition, locate studies us- Probability sample Stratified sample
observations made in order to provide a repre- data analysis software such as SPSS and NVivo.
ing (1) a quota sample, (2) a multistage duster
sentative picture of the total population. Basi-
sample, and (3) a systematic sample. Write a brief
cally, the weight assigned to a particular sample WEB LINKS FOR THIS CHAPTER
description of each study.
member should be the inverse of its probability
Please realize that the Internet is an evolving
of selection. entity, subject to change. Nevertheless, these
ADDITIONAL READINGS . . few Web sites should be fairly stable.

KEY TERMS_, Frankfort-Nachmias, Chava, and Anna Leon-


Guerrero. 2000. Social Statistics for a Diverse So-
ciety. 2nd ed. Thousand Oaks, CA: Pine Forge
The following terms are defined in context in the
chapter and at the bottom of the page where the term Press. See Chapter I1 especially. This statistics
is introduced, as well as in the comprehensive glossary textbook covers many of the topics we've dis-
at the back of the book. cussed in this chapter but in a more statistical
context. It demonstrates the links between
nonprobability sampling study population probability sampling and statistical analyses.
purposive (judgmental) random selection Kalton, Graham. 1983. Introduction to Survey Sam-
sampling sampling unit pling. Newbury Park, CA: Sage. Kalton goes into
snowball sampling parameter more of the mathematical details of sampling
quota sampling than the present chapter does, without attempt-
statistic
ing to be as definitive as Kish, described next.
informant sampling error
probability sampling Kish, Leslie. 1965. Survey Sampling. New York: Wi-
confidence level
ley. Unquestionably the definitive work on sam-
representativeness confidence interval
pling in social research. Kish's coverage ranges
EPSEM sampling frame from the simplest matters to the most complex
element simple random and mathematical, both highly theoretical and
population sampling downright practical. Easily readable and difficult
