
Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias
Author(s): Stanley L. Warner
Source: Journal of the American Statistical Association, Vol. 60, No. 309 (Mar., 1965), pp. 63-69
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2283137
Accessed: 24/12/2013 08:44

This content downloaded from 140.160.178.168 on Tue, 24 Dec 2013 08:44:48 AM All use subject to JSTOR Terms and Conditions

RANDOMIZED RESPONSE: A SURVEY TECHNIQUE FOR ELIMINATING EVASIVE ANSWER BIAS


STANLEY L. WARNER

Claremont Graduate School

For various reasons individuals in a sample survey may prefer not to confide to the interviewer the correct answers to certain questions. In such cases the individuals may elect not to reply at all or to reply with incorrect answers. The resulting evasive answer bias is ordinarily difficult to assess. In this paper it is argued that such bias is potentially removable through allowing the interviewee to maintain privacy through the device of randomizing his response. A randomized response method for estimating a population proportion is presented as an example. Unbiased maximum likelihood estimates are obtained and their mean square errors are compared with the mean square errors of conventional estimates under various assumptions about the underlying population.

1. INTRODUCTION

For reasons of modesty, fear of being thought bigoted, or merely a reluctance to confide secrets to strangers, many individuals attempt to evade certain questions put to them by interviewers. In survey vernacular, these people become the "non-cooperative" group [5, pp. 235-72], either refusing outright to be surveyed, or consenting to be surveyed but purposely providing wrong answers to the questions. In the one case there is the problem of refusal bias [1, pp. 355-61], [2, pp. 33-6], [5, pp. 261-9]; in the other case there is the problem of response bias [3, p. 89], [4, pp. 280-325].

The questions that people tend to evade are the questions which demand answers that are too revealing. Innocuous questions ordinarily receive good response, but questions requiring personal or controversial assertions excite resistance. When resistance is encountered, the usual modification of the survey method is simply an added effort on the part of the interviewer to gain the confidence of the interviewee. There is, however, a natural reticence of the general individual to confide certain things to anyone, let alone a stranger, and there is also a natural reluctance to have confidential statements on a paper containing his name and address. For some questions at least, probably only limited gains are possible through trying to persuade the interviewee that he surrenders little by confiding to the interviewer.

This paper suggests an alternate method for increasing cooperation. The method is built on the premise that cooperation should be naturally better if the questions allow answers which reveal less even to the interviewer. Essentially the method involves the device that, for certain questions not already innocuous, the interviewee responds with answers that furnish information only on a probability basis. As an example, one application might involve the interviewee's only making a true statement with a given probability less than 1. In this case, even the interviewer would know only the probability that the given answer was true. Inasmuch as this type of answer is less revealing than an answer required to be truthful with probability 1, it is suggested that this type of approach may encourage greater cooperation for certain survey problems. As another more detailed application of the randomized response method, the following section outlines a particular model for estimating a population proportion. The resulting estimates are then compared with conventional estimates under various assumptions about the cooperation of those interviewed.

2. A RANDOM RESPONSE MODEL FOR PROPORTIONS

Suppose that every person in a population belongs to either Group A or Group B and it is required to estimate by survey the proportion belonging to Group A. A simple random sample of $n$ people is drawn with replacement from the population and provisions made for each person to be interviewed. Before the interviews, each interviewer is furnished with an identical spinner with a face marked so that the spinner points to the letter A with probability $p$ and to the letter B with probability $(1-p)$. Then, in each interview, the interviewee is asked to spin the spinner unobserved by the interviewer and report only whether or not the spinner points to the letter representing the group to which the interviewee belongs. That is, the interviewee is required only to say yes or no according to whether or not the spinner points to the correct group; he does not report the group to which the spinner points. Under the assumption that these yes and no reports are made truthfully, maximum likelihood estimates of the true population proportion are straightforward. Let

$\pi$ = the true probability of A in the population,
$p$ = the probability that the spinner points to A, and
$X_i$ = 1 if the $i$th sample element says yes, 0 if the $i$th sample element says no.

Then

\[ P(X_i = 1) = \pi p + (1-\pi)(1-p), \]
\[ P(X_i = 0) = (1-\pi)p + \pi(1-p), \]

and arranging the indexing of the sample so that the first $n_1$ report "yes" while the second $(n-n_1)$ report "no," the likelihood of the sample is

\[ L = [\pi p + (1-\pi)(1-p)]^{n_1}\,[(1-\pi)p + \pi(1-p)]^{n-n_1}. \tag{1} \]
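
The interview procedure described above can be simulated directly. The following is a minimal Python sketch and not part of the original paper: the function names `yes_probability` and `simulate_reports`, the fixed seed, and the example values of $\pi$, $p$ and $n$ are illustrative assumptions. It treats each respondent as belonging to Group A independently with probability $\pi$, which corresponds to sampling with replacement.

```python
import random


def yes_probability(pi, p):
    """P(X_i = 1): the spinner points to the respondent's own group."""
    return pi * p + (1 - pi) * (1 - p)


def simulate_reports(pi, p, n, rng):
    """Simulate n truthful yes/no reports (1 = yes, 0 = no).

    Each respondent is in Group A with probability pi and privately spins
    a spinner that points to A with probability p (assumed independent).
    """
    reports = []
    for _ in range(n):
        in_group_a = rng.random() < pi
        spinner_points_to_a = rng.random() < p
        # "Yes" means the spinner points to the respondent's true group.
        reports.append(1 if in_group_a == spinner_points_to_a else 0)
    return reports


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, for repeatability only
    reports = simulate_reports(pi=0.3, p=0.7, n=10_000, rng=rng)
    print("observed share of yes:", sum(reports) / len(reports))
    print("model P(X_i = 1):     ", yes_probability(0.3, 0.7))
```

The interviewer in this sketch only ever sees the list of yes/no reports, never the group or the spinner outcome, which is the privacy property the design relies on.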

The log of the likelihood is

\[ \log L = n_1 \log[\pi p + (1-\pi)(1-p)] + (n-n_1)\log[(1-\pi)p + \pi(1-p)], \tag{2} \]

and necessary conditions on $\pi$ for a maximum are

\[ \frac{(n-n_1)(2p-1)}{(1-\pi)p + \pi(1-p)} = \frac{n_1(2p-1)}{\pi p + (1-\pi)(1-p)} \]

or

\[ \pi p + (1-\pi)(1-p) = \frac{n_1}{n}. \tag{3} \]

Then, supposing $p \neq 1/2$, the maximum likelihood estimate of $\pi$ is

\[ \hat{\pi} = \frac{p-1}{2p-1} + \frac{n_1}{(2p-1)n}. \tag{4} \]
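
As a check on the closed form in (4), the sketch below evaluates the log likelihood over a grid of candidate values of $\pi$ and compares the grid maximizer with the closed-form estimate. This is an illustrative Python sketch under assumptions: the function names, the grid resolution, and the survey outcome (460 yes answers out of 1000 with $p = .7$) are hypothetical and not taken from the paper.

```python
import math


def log_likelihood(pi, p, n1, n):
    """Log likelihood of observing n1 yes reports among n respondents."""
    yes = pi * p + (1 - pi) * (1 - p)
    no = (1 - pi) * p + pi * (1 - p)
    return n1 * math.log(yes) + (n - n1) * math.log(no)


def mle_proportion(n1, n, p):
    """Closed-form maximum likelihood estimate of pi; requires p != 1/2."""
    if p == 0.5:
        raise ValueError("p = 1/2 carries no information about pi")
    return (p - 1) / (2 * p - 1) + n1 / ((2 * p - 1) * n)


if __name__ == "__main__":
    n1, n, p = 460, 1000, 0.7  # hypothetical survey outcome
    closed_form = mle_proportion(n1, n, p)
    # Crude grid search over pi as an independent check of the closed form.
    grid = [k / 1000 for k in range(1, 1000)]
    best = max(grid, key=lambda pi: log_likelihood(pi, p, n1, n))
    print("closed form:", closed_form)
    print("grid search:", best)
```

The grid search is deliberately naive; it is there only to confirm that the closed form sits at the maximum of the log likelihood for this example.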

The expected value of the estimate is

\[ E\hat{\pi} = \frac{1}{2p-1}\Big[p - 1 + \frac{1}{n}\sum_{i=1}^{n} E X_i\Big] = \frac{1}{2p-1}\big[p - 1 + \pi p + (1-\pi)(1-p)\big] = \pi, \tag{5} \]

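and the variance of $\hat{\pi}$ is

\[
\begin{aligned}
\mathrm{Var}\,\hat{\pi} &= \frac{n\,\mathrm{Var}\,X_i}{(2p-1)^2 n^2} \\
&= \frac{[\pi p + (1-\pi)(1-p)]\,[(1-\pi)p + \pi(1-p)]}{(2p-1)^2 n} \\
&= \frac{1/4 + (2p^2 - 2p + 1/2)(-2\pi^2 + 2\pi - 1/2)}{(2p-1)^2 n} \\
&= \frac{1}{n}\left[\frac{1}{16(p-1/2)^2} - (\pi - 1/2)^2\right].
\end{aligned}
\tag{6}
\]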
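
The algebra linking the forms of the variance in (6) can be spot-checked numerically. The sketch below is an assumption-laden illustration, not the paper's own computation: the two function names and the test points are hypothetical. It evaluates the form based on the yes and no probabilities and the final closed form, and prints both.

```python
def variance_closed_form(pi, p, n):
    """Final form of (6): (1/n) [1 / (16 (p - 1/2)^2) - (pi - 1/2)^2]."""
    return (1 / (16 * (p - 0.5) ** 2) - (pi - 0.5) ** 2) / n


def variance_from_yes_probability(pi, p, n):
    """Form of (6) written as P(yes) P(no) / ((2p - 1)^2 n)."""
    yes = pi * p + (1 - pi) * (1 - p)
    return yes * (1 - yes) / ((2 * p - 1) ** 2 * n)


if __name__ == "__main__":
    for pi, p in [(0.6, 0.6), (0.6, 0.9), (0.5, 0.75), (0.2, 0.3)]:
        a = variance_closed_form(pi, p, 1000)
        b = variance_from_yes_probability(pi, p, 1000)
        print(f"pi={pi} p={p}  closed form={a:.6e}  product form={b:.6e}")
```

The two columns should agree up to floating-point rounding for any $p \neq 1/2$.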

Expression (5) shows $\hat{\pi}$ is an unbiased estimate of the true population proportion $\pi$.¹ Moreover, since $\hat{\pi}$ is a maximum likelihood estimate and any useful $n$'s are apt to be large, $\hat{\pi}$ may be assumed normally distributed about $\pi$ with the variance indicated in expression (6). Thus all the usual confidence intervals are easily established. Expression (6) also sets out the separate dependence of the variance of $\hat{\pi}$ upon the choice of $p$. In fact, identifying

\[ \frac{1/4 - (\pi - 1/2)^2}{n} = \frac{\pi(1-\pi)}{n} \]

as the variance due to sampling and writing expression (6) as

\[ \mathrm{Var}\,\hat{\pi} = \frac{1/4 - (\pi - 1/2)^2}{n} + \frac{\dfrac{1}{16(p-1/2)^2} - \dfrac{1}{4}}{n} \tag{7} \]


it is clear that the variance of $\hat{\pi}$ can be expressed as the sum of the variance due to sampling plus the variance due to the random device.

¹ The possibility of $\hat{\pi}$ taking values outside the 0-1 range cannot be ruled out, but this possibility is remote in large samples.

Two practical questions concern the estimation method implied by $\hat{\pi}$. First, how likely are people to cooperate and tell the truth when asked to respond in the manner described? Second, how large a sample is required to obtain various degrees of precision by this estimate as compared to the conventional estimate?

The first question is primarily an empirical question, but the rationale for expecting better cooperation is clear. The individual being interviewed is asked for less. The matter of how much less is summarized by the parameter $p$. Note first from expression (1) that if $p = 1/2$, the likelihood function does not even depend on $\pi$. Thus, for a $p = 1/2$, the interviewee would be furnishing no information at all. Then note that if $p = 1$, the entire procedure would reduce to the conventional procedure of requiring the individual to state unreservedly whether or not he belonged to Group A. For $p$'s between 1/2 and 1 (or between 1/2 and 0) the person interviewed provides useful but not absolute information as to exactly which group he is in. In this context the $p$ can be thought of as describing the nature of the cooperation between the interviewer and the interviewee. As $p$ goes from 1 to 1/2 the burden of cooperating passes from the interviewee to the interviewer. It therefore seems reasonable to expect that for some questions at least, $p$'s less than 1 should induce greater cooperation on the part of the person interviewed.

The question of the sample size required for a given level of precision also depends on the parameter $p$. If a $p$ close to 1 (or close to 0) is adequate to insure cooperation, then a smaller sample size is required than if a $p$ close to 1/2 is required to insure cooperation. Values of $p$ close to 1/2 convey less information from each interview, thus they also imply either a larger variance of the estimate or a larger sample size. Substituting values of $p$ in expression (6) sets out the precise relation. As an example, supposing a $\pi = .5$ and a $p$ halfway between the zero and full information points, i.e., a $p$ of .75, the variance shown by (6) is $1/n$. This would imply that the sample size should be about 400 in order to secure a standard deviation of .05.
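
The sample-size arithmetic in this example follows from solving expression (6) for $n$ at a target standard deviation. A minimal Python sketch, assuming truthful randomized responses; the function names `variance_coefficient` and `required_sample_size` are hypothetical helpers introduced here for illustration.

```python
def variance_coefficient(pi, p):
    """n times the variance in expression (6)."""
    return 1 / (16 * (p - 0.5) ** 2) - (pi - 0.5) ** 2


def required_sample_size(pi, p, target_sd):
    """Sample size n at which the standard deviation of the estimate
    equals target_sd (returned as a float; round up in practice)."""
    return variance_coefficient(pi, p) / target_sd ** 2


if __name__ == "__main__":
    for p in (0.6, 0.75, 0.9, 1.0):
        n = required_sample_size(pi=0.5, p=p, target_sd=0.05)
        print(f"p = {p:<4}  required n is about {n:.0f}")
```

Setting `p=1.0` in this sketch corresponds to the conventional direct question answered truthfully, so the same function covers both designs.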
By way of comparison, the conventional estimation method (equivalent to a $p = 1$) would imply that a sample of only about 100 would be sufficient for a standard deviation of .05, provided that all the interviewees told the truth for the regular method.

The more pertinent comparisons are between the randomized estimates and regular estimates under the assumption that the regular estimates are handicapped by less than 100 per cent truthfulness. Suppose that in a regular survey all consent to be surveyed, but members of Group A tell the truth only with probability $T_a$ and members of Group B tell the truth only with probability $T_b$. Then, if $Y_i = 1$ or 0 according as the $i$th member of the sample reports he is or is not in Group A, the conventional estimate of the true population proportion $\pi$ is

\[ \hat{\pi}' = \frac{1}{n}\sum_{i=1}^{n} Y_i. \tag{8} \]

The expected value, response bias [3, p. 89], and variance of this regular estimate are given by

\[ E\hat{\pi}' = \pi T_a + [(1-\pi)(1-T_b)], \tag{9} \]

\[ \mathrm{Bias}\,\hat{\pi}' = E(\hat{\pi}' - \pi) = \pi[T_a + T_b - 2] + [1 - T_b], \]

and

\[ \mathrm{Var}\,\hat{\pi}' = \frac{[\pi T_a + (1-\pi)(1-T_b)]\,[1 - \pi T_a - (1-\pi)(1-T_b)]}{n}. \tag{10} \]
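
The three quantities above are simple to evaluate for given truth probabilities. The following Python sketch is illustrative only; the function names and the example values ($\pi = .6$, $T_a = 1$, $T_b = .9$, $n = 1000$) are assumptions chosen for demonstration.

```python
def regular_expected_value(pi, t_a, t_b):
    """Expected share reporting Group A: truthful A's plus untruthful B's."""
    return pi * t_a + (1 - pi) * (1 - t_b)


def regular_bias(pi, t_a, t_b):
    """Response bias of the regular estimate, E(estimate) - pi."""
    return regular_expected_value(pi, t_a, t_b) - pi


def regular_variance(pi, t_a, t_b, n):
    """Binomial variance of the regular estimate for a sample of size n."""
    expected = regular_expected_value(pi, t_a, t_b)
    return expected * (1 - expected) / n


if __name__ == "__main__":
    pi, t_a, t_b, n = 0.6, 1.0, 0.9, 1000
    print("expected value:", regular_expected_value(pi, t_a, t_b))
    print("bias:          ", regular_bias(pi, t_a, t_b))
    print("variance:      ", regular_variance(pi, t_a, t_b, n))
```

Note that the bias does not shrink as $n$ grows, whereas the variance does; this is why untruthful answers cannot be cured by a larger conventional sample.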

Tables 1 and 2 then compare the mean square errors (the variance plus the square of the bias) of the randomized and regular methods of estimation under the assumption that the interviewed individuals tell the truth in the randomized method but only tell the truth in the non-random method with probabilities given by $T_a$ and $T_b$. The left-hand two columns of each table indicate various paired values for $T_a$ and $T_b$. The third column shows the bias of the non-random method, and the remaining columns exhibit the ratios of the mean square errors of the randomized estimates to the mean square errors of the regular estimates for various values of $p$. Tables 1 and 2 are respectively appropriate for the cases where the true probability of A is .6 and .5. The sample size is set at 1000 in each case.

TABLE 1. COMPARISON OF RANDOMIZED AND REGULAR ESTIMATES FOR TRUE PROBABILITY OF A = .6 AND n = 1000

        Regular Estimates            Mean Square Error Randomized / Mean Square Error Regular
    Probability of Truth
      Ta        Tb       Bias         p = .6    p = .7    p = .8    p = .9
      .95      1.00      -.03          5.45      1.36       .60       .33
      .90      1.00      -.06          1.62       .40       .18       .10
      .70      1.00      -.18           .19       .05       .02       .01
      .50      1.00      -.30           .07       .02       .01       .00
     1.00       .95       .02          9.82      2.44      1.08       .60
     1.00       .90       .04          3.41       .85       .37       .21
     1.00       .70       .12           .43       .11       .05       .03
     1.00       .50       .20           .16       .04       .02       .01
      .95       .95      -.01         18.25      4.54      2.00      1.11
      .90       .90      -.02          9.70      2.41      1.06       .59
      .70       .70      -.06          1.62       .40       .18       .10
      .50       .50      -.10           .61       .15       .07       .04

TABLE 2. COMPARISON OF RANDOMIZED AND REGULAR ESTIMATES FOR TRUE PROBABILITY OF A = .5 AND n = 1000

        Regular Estimates            Mean Square Error Randomized / Mean Square Error Regular
    Probability of Truth
      Ta        Tb       Bias         p = .6    p = .7    p = .8    p = .9
      .95      1.00      -.03          7.15      1.79       .79       .45
      .90      1.00      -.05          2.28       .57       .25       .14
      .70      1.00      -.15           .28       .07       .03       .02
      .50      1.00      -.25           .10       .03       .01       .01
      .95       .95       .00         25.00      6.25      2.78      1.56
      .90       .90       .00         25.00      6.25      2.78      1.56
      .70       .70       .00         25.00      6.25      2.78      1.56
      .50       .50       .00         25.00      6.25      2.78      1.56
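
The ratios in the tables can be recomputed from expressions (6), (9) and (10). The sketch below is a hedged illustration rather than the author's own computation: the function names are hypothetical, and it assumes, as the tables do, that answers under the randomized method are fully truthful, so that the mean square error of the randomized estimate equals its variance.

```python
def randomized_mse(pi, p, n):
    """MSE of the randomized estimate; it is unbiased, so MSE = variance (6)."""
    return (1 / (16 * (p - 0.5) ** 2) - (pi - 0.5) ** 2) / n


def regular_mse(pi, t_a, t_b, n):
    """MSE of the regular estimate: variance plus squared response bias."""
    expected = pi * t_a + (1 - pi) * (1 - t_b)
    bias = expected - pi
    return expected * (1 - expected) / n + bias ** 2


def mse_ratio(pi, p, t_a, t_b, n):
    """Ratio of randomized MSE to regular MSE, as tabulated."""
    return randomized_mse(pi, p, n) / regular_mse(pi, t_a, t_b, n)


if __name__ == "__main__":
    truth_pairs = [(0.95, 1.0), (0.90, 1.0), (1.0, 0.95), (1.0, 0.90)]
    for t_a, t_b in truth_pairs:
        row = [mse_ratio(0.6, p, t_a, t_b, 1000) for p in (0.6, 0.7, 0.8, 0.9)]
        print(f"Ta={t_a:.2f} Tb={t_b:.2f}", " ".join(f"{r:6.2f}" for r in row))
```

Because `n` is a parameter, the same functions can be used to explore how the ratios move for other sample sizes.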
3. CONCLUSIONS

Both tables are constructed under the assumption that the $p$ in each case is low enough to induce full cooperation in the randomized approach. Thus the advantages of the randomized method, shown by those ratios in the tables that are less than 1, are in the nature of potential advantages that depend upon the cooperation actually achieved by the randomized method. Nevertheless, there is the clear suggestion that the randomized method is apt to out-perform the regular method in a variety of situations. Table 1 with $T_a = 1$ and $T_b = .9$, for example, exhibits the situation in which members of the minority B population resent directly confiding their minority status to their interviewer to the point where ten per cent of them say A instead of B. The bias created is +.04, and the ratio of mean square errors varies from 3.41 to .21, depending on the value of $p$. The possible improvement through randomization in this case is evident. An even greater improvement is possible if it is the larger population that hesitates to identify itself openly. This latter case is exemplified by the row in which $T_a = .9$ and $T_b = 1$.

More generally it is to be observed that, except for the cases where the bias of the regular estimate is 0 or negligible, there appear to be sizable potential gains through the randomized response. It should also be kept in mind that the potential advantages of randomizing are even larger for larger samples. For example, a sample size of 2000 would imply that the entry in Table 1, column 4, row 2, would change from 1.62 to .84. Thus the randomized method is to be preferred in this instance even if a $p$ as low as .6 is required to assure cooperation.

The question is still open as to what methods of randomized response will prove the most useful. Even with regard to estimating proportions, the method set out in Section 2 is only one of many possibilities. It is interesting to note in this connection that a model mathematically equivalent to the one of Section 2 is furnished by simply requiring each interviewee to make a statement that is true with probability $p$ as to which of the two groups he is in. Thus in this model, the interviewee, again out of sight of the interviewer, spins a spinner which points to "true" with given probability $p$ and to "false" with probability $(1-p)$. Then the interviewee makes a statement that is true or false according to the way the spinner pointed. Psychologically this would appear to be quite a different model from that of Section 2, but the statistical properties of the two models are equivalent.² The maximum likelihood estimate for the latter scheme has the same form and the same variance as the estimate of Section 2. There is thus a question as to which of these or other equivalent models is to be preferred from the standpoint of increasing cooperation.

² As before, a $p$ of 1 furnishes full information, a $p$ of 1/2 furnishes no information, and other $p$'s furnish information depending on how far they are from 1/2. It is a feature of the dichotomous nature of the population that telling the truth .2 of the time is equivalent to telling the truth .8 of the time.

Finally, it should be noted that it is easy to extend the randomized response technique to estimate distributions other than that appropriate to a simple dichotomous variable. As one example, the technique could be applied to estimate a five-class income distribution through the obvious device of estimating the proportion in each class separately by the method of Section 2. In this case each interviewee might simply be asked to make five separate randomized responses concerning whether or not he was in each of the five separate classes. Just as with the proportion problem, it is clear that other randomized response methods may be imagined for this more general estimation problem. And just as with the proportion problem, the question of which specific technique will prove superior is a matter for empirical investigation.
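
The five-class extension mentioned above can be sketched as five independent applications of the Section 2 estimator. The Python below is one possible reading under stated assumptions, not a procedure specified in the paper: the function name `estimate_class_proportions`, the class shares, the value $p = .8$, and the use of a fresh, independent spin for each of the five questions are all hypothetical choices.

```python
import random


def estimate_class_proportions(true_classes, num_classes, p, rng):
    """Estimate each class share from separate randomized yes/no responses.

    For class k the two "groups" are "in class k" and "not in class k".
    The spinner points to "in class k" with probability p, and the
    respondent says yes when the spinner matches his true status.
    """
    n = len(true_classes)
    estimates = []
    for k in range(num_classes):
        yes_count = 0
        for c in true_classes:
            spinner_says_in_class = rng.random() < p
            if (c == k) == spinner_says_in_class:
                yes_count += 1
        # Section 2 estimator applied to class k alone.
        estimates.append((p - 1) / (2 * p - 1) + yes_count / ((2 * p - 1) * n))
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed
    shares = [0.10, 0.25, 0.30, 0.25, 0.10]  # hypothetical income-class shares
    sample = rng.choices(range(5), weights=shares, k=20_000)
    estimates = estimate_class_proportions(sample, 5, 0.8, rng)
    print([round(e, 3) for e in estimates])
```

Because the five estimates are computed separately, nothing in this sketch forces them to sum to one; how best to handle that is left open here, in keeping with the text's remark that the choice of technique is an empirical matter.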
4. ACKNOWLEDGMENTS

I am indebted to the referee for helpful suggestions.


REFERENCES

[1] Cochran, W. G., Sampling Techniques, Second Edition. New York: John Wiley and Sons, Inc., 1963.
[2] Deming, W. E., Some Theory of Sampling. New York: John Wiley and Sons, Inc., 1950.
[3] Hansen, M. H., Hurwitz, W. N., and Madow, W. G., Sample Survey Methods and Theory, Volume I. New York: John Wiley and Sons, Inc., 1953.
[4] Hansen, M. H., Hurwitz, W. N., and Madow, W. G., Sample Survey Methods and Theory, Volume II. New York: John Wiley and Sons, Inc., 1953.
[5] Stephan, F. F., and McCarthy, P. J., Sampling Opinions. New York: John Wiley and Sons, Inc., 1963.
