
American Economic Association

Let's Take the Con Out of Econometrics


Author(s): Edward E. Leamer
Source: The American Economic Review, Vol. 73, No. 1 (Mar., 1983), pp. 31-43
Published by: American Economic Association
Stable URL: http://www.jstor.org/stable/1803924
Accessed: 22-03-2015 11:12 UTC


This content downloaded from 202.28.191.34 on Sun, 22 Mar 2015 11:12:45 UTC
All use subject to JSTOR Terms and Conditions
Let's Take the Con out of Econometrics

By EDWARD E. LEAMER*

Econometricians would like to project the image of agricultural experimenters who divide a farm into a set of smaller plots of land and who select randomly the level of fertilizer to be used on each plot. If some plots are assigned a certain amount of fertilizer while others are assigned none, then the difference between the mean yield of the fertilized plots and the mean yield of the unfertilized plots is a measure of the effect of fertilizer on agricultural yields. The econometrician's humble job is only to determine if that difference is large enough to suggest a real effect of fertilizer, or is so small that it is more likely due to random variation.

This image of the applied econometrician's art is grossly misleading. I would like to suggest a more accurate one. The applied econometrician is like a farmer who notices that the yield is somewhat higher under trees where birds roost, and he uses this as evidence that bird droppings increase yields. However, when he presents this finding at the annual meeting of the American Ecological Association, another farmer in the audience objects that he used the same data but came up with the conclusion that moderate amounts of shade increase yields. A bright chap in the back of the room then observes that these two hypotheses are indistinguishable, given the available data. He mentions the phrase "identification problem," which, though no one knows quite what he means, is said with such authority that it is totally convincing. The meeting reconvenes in the halls and in the bars, with heated discussion whether this is the kind of work that merits promotion from Associate to Full Farmer; the Luminists strongly opposed to promotion and the Aviophiles equally strong in favor.

One should not jump to the conclusion that there is necessarily a substantive difference between drawing inferences from experimental as opposed to nonexperimental data. The images I have drawn are deliberately prejudicial. First, we had the experimental scientist with hair neatly combed, wide eyes peering out of horn-rimmed glasses, a white coat, and an electronic calculator for generating the random assignment of fertilizer treatment to plots of land. This seems to contrast sharply with the nonexperimental farmer with overalls, unkempt hair, and bird droppings on his boots. Another image, drawn by Orcutt, is even more damaging: "Doing econometrics is like trying to learn the laws of electricity by playing the radio." However, we need not now submit to the tyranny of images, as many of us have in the past.

I. Is Randomization Essential?

What is the real difference between these two settings? Randomization seems to be the answer. In the experimental setting, the fertilizer treatment is "randomly" assigned to plots of land, whereas in the other case nature did the assignment. Now it is the tyranny of words that we must resist. "Random" does not mean adequately mixed in every sample. It only means that on the average, the fertilizer treatments are adequately mixed. Randomization implies that the least squares estimator is "unbiased," but that definitely does not mean that for each sample the estimate is correct. Sometimes the estimate is too high, sometimes too low. I am reminded of the lawyer who remarked that "when I was a young man I lost many cases that I should have won, but when I grew older I won many that I should have lost, so on the average justice was done."
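
The point that unbiasedness describes the average over many randomizations, and not any single sample, can be illustrated with a small simulation. The sketch below is not part of the original lecture: the farm layout (half the plots shaded, half in full sun), the linear yield model, the noise level, and the values `beta = 2.0` and `gamma = 3.0` are all hypothetical, chosen only for illustration.

```python
import random
import statistics


def simulate_estimates(n_plots=20, n_experiments=2000, beta=2.0, gamma=3.0, seed=0):
    """Return difference-in-means estimates of beta, one per randomized experiment.

    Assumed (hypothetical) model: yield = beta * fertilized + gamma * light + noise.
    Light is never measured, so it is ignored by the estimator.
    """
    rng = random.Random(seed)
    # Hypothetical fixed farm: first half of the plots shaded, second half sunny.
    light = [0.0] * (n_plots // 2) + [1.0] * (n_plots - n_plots // 2)
    estimates = []
    for _ in range(n_experiments):
        # Randomly choose exactly half of the plots to receive fertilizer.
        treated = set(rng.sample(range(n_plots), n_plots // 2))
        yields = [
            beta * (i in treated) + gamma * light[i] + rng.gauss(0.0, 1.0)
            for i in range(n_plots)
        ]
        mean_treated = statistics.fmean(yields[i] for i in treated)
        mean_control = statistics.fmean(
            yields[i] for i in range(n_plots) if i not in treated
        )
        estimates.append(mean_treated - mean_control)
    return estimates


estimates = simulate_estimates()
print("assumed true beta: 2.0")
print("mean of estimates:", round(statistics.fmean(estimates), 2))
print("smallest estimate:", round(min(estimates), 2))
print("largest estimate: ", round(max(estimates), 2))
```

Under these assumptions the average of the estimates should sit close to the assumed fertilizer effect, while individual experiments land well above or below it, depending on how many sunny plots happened to receive fertilizer in that particular draw.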
*Professor of economics, University of California-Los Angeles. This paper was a public lecture presented at the University of Toronto, January 1982. I acknowledge partial support by NSF grant SOC78-09479.

In particular, it is possible for the randomized assignment to lead to exactly the same allocation as the nonrandom assignment,


namely, with treated plots of land all being under trees and with nontreated plots of land all being away from trees. I submit that, if this is the outcome of the randomization, then the randomized experiment and the nonrandomized experiment are exactly the same. Many econometricians would insist that there is a difference, because the randomized experiment generates "unbiased" estimates. But all this means is that, if this particular experiment yields a gross overestimate, some other experiment yields a gross underestimate.

Randomization thus does not assure that each and every experiment is "adequately mixed," but randomization does make "adequate mixing" probable. In order to make clear what I believe to be the true value of randomization, let me refer to the model

(1)   $Y_i = \alpha + \beta F_i + \gamma L_i + U_i$,

where $Y_i$ is the yield of plot $i$; $F_i$ is the fertilizer assigned to plot $i$; $L_i$ is the light falling on plot $i$; $U_i$ is the unspecified influence on the yield of plot $i$, and where $\beta$, the fertilizer effect, is the object of the inferential exercise. We may suppose to begin the argument that the light level is expensive to measure and that it is decided to base an estimate of $\beta$ initially only on measurement of $Y_i$ and $F_i$. We may assume also that the natural experiment produces values for $F_i$, $L_i$, and $U_i$ with expected values $E(U_i \mid F_i) = 0$ and $E(L_i \mid F_i) = r_0 + r_1 F_i$. In the more familiar parlance, it is assumed that the fertilizer level and the residual effects are uncorrelated, but the fertilizer level and the light level are possibly correlated. As every beginning econometrics student knows, if you omit from a model a variable which is correlated with included variables, bad things happen. These bad things are revealed to the econometrician by computing the conditional mean of $Y$ given $F$ but not $L$:

(2)   $E(Y \mid F) = \alpha + \beta F + \gamma E(L \mid F)$
      $= \alpha + \beta F + \gamma (r_0 + r_1 F)$
      $= (\alpha + \alpha^*) + (\beta + \beta^*) F$,

where $\alpha^* = \gamma r_0$ and $\beta^* = \gamma r_1$. The linear regression of $Y$ on $F$ provides estimates of the parameters of the conditional distribution of $Y$ given $F$, and in this case the regression coefficients are estimates not of $\alpha$ and $\beta$, but rather of $\alpha + \alpha^*$ and $\beta + \beta^*$. The parameters $\alpha^*$ and $\beta^*$ measure the bias in the least squares estimates. This bias could be due to left-out variables, or to measurement errors in $F$, or to simultaneity.

When observing a nonexperiment, the bias parameters $\alpha^*$ and $\beta^*$ can be thought to be small, but they cannot sensibly be treated as exact zeroes. The notion that the bias parameters are small can be captured by the assumption that $\alpha^*$ and $\beta^*$ are drawn from a normal distribution with zero means and covariance matrix $M$. The model can then be written as $Y = \alpha + \beta F + \varepsilon$, where $\varepsilon$ is the sum of three random variables: $U + \alpha^* + \beta^* F$. Because the error term $\varepsilon$ is not spherical, the proper way to estimate $\alpha$ and $\beta$ is generalized least squares. My 1974 article demonstrates that if $(a, b)$ represent the least squares estimates of $(\alpha, \beta)$, then the generalized least squares estimates $(\hat\alpha, \hat\beta)$ are also equal to $(a, b)$:

(3)   $\begin{pmatrix} \hat\alpha \\ \hat\beta \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$,

and if $S$ represents the sample covariance matrix for the least squares estimates, then the sample covariance matrix for $(\hat\alpha, \hat\beta)$ is

(4)   $\mathrm{Var}(\hat\alpha, \hat\beta) = S + M$,

where $M$ is the covariance matrix of $(\alpha^*, \beta^*)$. The meaning of equation (3) is that unless one knows the direction of the bias, the possibility of bias does not call for any adjustment to the estimates. The possibility of bias does require an adjustment to the covariance matrix (4). The uncertainty is composed of two parts: the usual sampling uncertainty $S$ plus the misspecification uncertainty $M$. As sample size grows, the sampling uncertainty $S$ ever decreases, but the misspecification uncertainty $M$ remains ever constant. The misspecification matrix $M$ that we must add to the least squares variance


matrix is just the (prior) variance of the bias coefficients $(\alpha^*, \beta^*)$. If this variance matrix is small, the least squares bias is likely to be small. If $M$ is large, it is correspondingly probable that $(\alpha^*, \beta^*)$ is large.

It would be a remarkable bootstrap if we could determine the extent of the misspecification from the data. The data in fact contain no information about the size of the bias, a point which is revealed by studying the likelihood function. The misspecification matrix $M$ is therefore a pure prior concept. One must decide independent of the data how good the nonexperiment is.

The formal difference between a randomized experiment and a natural experiment is measured by the matrix $M$. If the treatment is randomized, the bias parameters $(\alpha^*, \beta^*)$ are exactly zero, or, equivalently, the matrix $M$ is a zero matrix. If $M$ is zero, the least squares estimates are consistent. If $M$ is not zero, as in the natural experiment, there remains a fixed amount of specification uncertainty, independent of sample size.

There is therefore a sharp difference between inference from randomized experiments and inference from natural experiments. This seems to draw a sharp distinction between economics where randomized experiments are rare and "science" where experiments are routinely done. But the fact of the matter is that no one has ever designed an experiment that is free of bias, and no one can. As it turns out, the technician who was assigning fertilizer levels to plots of land, took his calculator into the fields, and when he was out in the sun, the calculator got heated up and generated large "random" numbers, which the technician took to mean no fertilizer; and when he stood under the shade of the trees, his cool calculator produced small numbers, and these plots received fertilizer.

You may object that this story is rather fanciful, but I need only make you think it is possible, to force you to set $M \neq 0$. Or if you think a computer can really produce random numbers (calculated by a mathematical formula and therefore perfectly predictable!), I will bring up mismeasurement of the fertilizer level, or human error in carrying out the computer instructions. Thus, the attempt to randomize and the attempt to measure accurately ensures that $M$ is small, but not zero, and the difference between scientific experiments and natural experiments is difference in degree, but not in kind. Admittedly, however, the misspecification uncertainty in many experimental settings may be so small that it is well approximated by zero. This can very rarely be said in nonexperimental settings.

Examples may be ultimately convincing. There is a great deal of empirical knowledge in the science of astronomy, yet there are no experiments. Medical knowledge is another good example. I was struck by a headline in the January 5, 1982 New York Times: "Life Saving Benefits of Low-Cholesterol Diet Affirmed in Rigorous Study." The article describes a randomized experiment with a control group and a treated group. "Rigorous" is therefore interpreted as "randomized." As a matter of fact, there was a great deal of evidence suggesting a link between heart disease and diet before any experiments were performed on humans. There were cross-cultural comparisons and there were animal studies. Actually, the only reason for performing the randomized experiment was that someone believed there was pretty clear nonexperimental evidence to begin with. The nonexperimental evidence was, of course, inconclusive, which in my language means that the misspecification uncertainty $M$ remained uncomfortably large. The fact that the Japanese have both less incidence of heart disease and also diets lower in cholesterol compared to Americans is not convincing evidence, because there are so many other factors that remain unaccounted for. The fact that pigs on a high cholesterol diet develop occluded arteries is also not convincing, because the similarity in physiology in pigs and humans can be questioned.

When the sampling uncertainty $S$ gets small compared to the misspecification uncertainty $M$, it is time to look for other forms of evidence, experiments or nonexperiments. Suppose I am interested in measuring the width of a coin, and I provide rulers to a room of volunteers. After each volunteer has reported a measurement, I compute the mean and standard deviation, and I conclude that


the coin has width 1.325 millimeters with a standard error of .013. Since this amount of uncertainty is not to my liking, I propose to find three other rooms full of volunteers, thereby multiplying the sample size by four, and dividing the standard error in half. That is a silly way to get a more accurate measurement, because I have already reached the point where the sampling uncertainty $S$ is very small compared with the misspecification uncertainty $M$. If I want to increase the true accuracy of my estimate, it is time for me to consider using a micrometer. So too in the case of diet and heart disease. Medical researchers had more or less exhausted the vein of nonexperimental evidence, and it became time to switch to the more expensive but richer vein of experimental evidence.

In economics, too, we are switching to experimental evidence. There are the laboratory experiments of Charles Plott and Vernon Smith (1978) and Smith (1980), and there are the field experiments such as the Seattle/Denver income maintenance experiment. Another way to limit the misspecification error $M$ is to gather different kinds of nonexperiments. Formally speaking, we will say that experiment 1 is qualitatively different from experiment 2 if the bias parameters $(\alpha_1^*, \beta_1^*)$ are distributed independently of the bias parameters $(\alpha_2^*, \beta_2^*)$. In that event, simple averaging of the data from the two experiments yields average bias parameters $(\alpha_1^* + \alpha_2^*, \beta_1^* + \beta_2^*)/2$ with misspecification variance matrix $M/2$, half as large as the (common) individual variances. Milton Friedman's study of the permanent income hypothesis is the best example of this that I know. Other examples are hard to come by. I believe we need to put much more effort into identifying qualitatively different and convincing kinds of evidence.

Parenthetically, I note that traditional econometric theory, which does not admit experimental bias, as a consequence also admits no "hard core" propositions. Demand curves can be shown to be positively sloped. Utility can be shown not to be maximized. Econometric evidence of a positively sloped demand curve would, as a matter of fact, be routinely explained in terms of simultaneity bias. If utility seems not to have been maximized, it is only that the econometrician has misspecified the utility function. The misspecification matrix $M$ thus forms Imre Lakatos' "protective belt" which protects certain hard core propositions from falsification.

II. Is Control Essential?

The experimental scientist who notices that the fertilizer treatment is correlated with the light level can correct his experimental design. He can control the light level, or he can allocate the fertilizer treatment in such a way that the fertilizer level and the light level are not perfectly correlated.

The nonexperimental scientist by definition cannot control the levels of extraneous influences such as light. But he can control for the variable light level by including light in the estimating equation. Provided nature does not select values for light and values for fertilizer levels that are perfectly correlated, the effect of fertilizer on yields can be estimated with a multiple regression. The collinearity in naturally selected treatment variables may mean that the data evidence is weak, but it does not invalidate in any way the usual least squares estimates. Here, again, there is no essential difference between experimental and nonexperimental inference.

III. Are the Degrees of Freedom Inadequate with Nonexperimental Data?

As a substitute for experimental control, the nonexperimental researcher is obligated to include in the regression equation all variables that might have an important effect. The NBER data banks contain time-series data on 2,000 macroeconomic variables. A model explaining gross national product in terms of all these variables would face a severe degrees-of-freedom deficit since the number of annual observations is less than thirty. Though the number of observations of any phenomenon is clearly limited, the number of explanatory variables is logically unlimited. If a polynomial could have a degree as high as $k$, it would usually be admitted that the degree could be $k + 1$ as well. A theory that allows $k$ lagged explanatory variables would ordinarily allow $k + 1$. If the level of money might affect GNP, then why not the number of presidential sneezes, or the size of the polar ice cap?

[FIGURE 1. HYPOTHETICAL DATA AND THREE ESTIMATED QUADRATIC FUNCTIONS. Horizontal axis: fertilizer per acre, marked at 0 and $F_1$.]

The number of explanatory variables is unlimited in a nonexperimental setting, but it is also unlimited in an experimental setting. Consider again the fertilizer example in which the farmer randomly decides either to apply $F_1$ pounds of fertilizer per acre or zero pounds, and obtains the data illustrated in Figure 1. These data admit the inference that fertilizer level $F_1$ produces higher yields than no fertilizer. But the farmer is interested in selecting the fertilizer level that maximizes profits. If it is hypothesized that yield is a linear function of the fertilizer intensity $Y = \alpha + \beta F + U$, then profits are

      $\text{Profits} = pA(\alpha + \beta F + U) - P_F A F$,

where $A$ is total acreage, $p$ is the product price, and $P_F$ is the price per pound of fertilizer. This profit function is linear in $F$ with slope $A(\beta p - P_F)$. The farmer maximizes profits therefore by using no fertilizer if the price of fertilizer is high, $\beta p < P_F$, and using an unlimited amount of fertilizer if the price is low, $\beta p > P_F$. It is to be expected that you will find this answer unacceptable for one of several reasons:

1) When the farmer tries to buy an unlimited amount of fertilizer, he will drive up its price, and the problem should be reformulated to make $P_F$ a function of $F$.
2) Uncertainty in the fertilizer effect $\beta$ causes uncertainty in profits, $\mathrm{Variance(profits)} = p^2 A^2 F^2 \mathrm{Var}(\beta)$, and risk aversion will limit the level of fertilizer applied.
3) The yield function is nonlinear.

Economic theorists doubtless find reasons 1) and 2) compelling, but I suspect that the real reason farmers don't use huge amounts of fertilizer is that the marginal increase in the yield eventually decreases. Plants don't grow in fertilizer alone.

[FIGURE 2. HYPOTHETICAL DATA AND ESTIMATED QUADRATIC FUNCTION. Horizontal axis: fertilizer per acre, marked at 0, $F_1$, $F_2$, and $F_m$.]

So let us suppose that yield is a quadratic function of fertilizer intensity, $Y = \alpha + \beta_1 F + \beta_2 F^2 + U$, and suppose we have only the data illustrated in Figure 1. Unfortunately, there are an infinite number of quadratic functions all of which fit the data equally well, three of which are drawn. If there were no other information available, we could conclude only that the yield is higher at $F_1$ than at zero. Formally speaking, there is an identification problem, which can be solved by altering the experimental design. The yield must be observed at a third point, as in Figure 2, where I have drawn the least squares estimated quadratic function and have indicated the fertilizer intensity $F_m$ that maximizes the yield. I expect that most people would question whether these data admit the


inference that the yield is maximized at $F_m$. Actually, after inspection of this figure, I don't think anything can be inferred except that the yield at $F_2$ is higher than at $F_1$, which in turn is higher than at zero. Thus I don't believe the function is quadratic. If it is allowed to be a cubic then again there is an identification problem.

This kind of logic can be extended indefinitely. One can always find a set of observations that will make the inferences implied by a polynomial of degree $p$ seem silly. This is true regardless of the degree $p$. Thus no model with a finite number of parameters is actually believed, whether the data are experimental or nonexperimental.

IV. Do We Need Prior Information?

A model with an infinite number of parameters will allow inference from a finite data set only if there is some prior information that effectively constrains the ranges of the parameters. Figure 3 depicts another hypothetical sequence of observations and three estimated relationships between yield and fertilizer. I believe the solid line A is a better representation of the relationship than either of the other two. The piecewise linear form B fits the data better, but I think this peculiar meandering function is highly unlikely on an a priori basis. Though B and C fit the data equally well, I believe that B is much more likely than C. What I am revealing is the a priori opinion that the function is likely to be smooth and single peaked.

[FIGURE 3. HYPOTHETICAL DATA AND THREE ESTIMATED FUNCTIONS. Horizontal axis: fertilizer per acre.]

What should now be clear is that data alone cannot reveal the relationship between yield and fertilizer intensity. Data can reveal the yield at sampled values of fertilizer intensities, but in order to interpolate between these sampled values, we must resort to subjective prior information.

Economists have inherited from the physical sciences the myth that scientific inference is objective, and free of personal prejudice. This is utter nonsense. All knowledge is human belief; more accurately, human opinion. What often happens in the physical sciences is that there is a high degree of conformity of opinion. When this occurs, the opinion held by most is asserted to be an objective fact, and those who doubt it are labelled "nuts." But history is replete with examples of opinions losing majority status, with once-objective "truths" shrinking into the dark corners of social intercourse. To give a trivial example, coming now from California I am unsure whether fat ties or thin ties are aesthetically more pleasing.

The false idol of objectivity has done great damage to economic science. Theoretical econometricians have interpreted scientific objectivity to mean that an economist must identify exactly the variables in the model, the functional form, and the distribution of the errors. Given these assumptions, and given a data set, the econometric method produces an objective inference from a data set, unencumbered by the subjective opinions of the researcher.

This advice could be treated as ludicrous, except that it fills all the econometric textbooks. Fortunately, it is ignored by applied econometricians. The econometric art as it is practiced at the computer terminal involves fitting many, perhaps thousands, of statistical models. One or several that the researcher finds pleasing are selected for reporting purposes. This searching for a model is often well intentioned, but there can be no doubt that such a specification search invalidates the traditional theories of inference. The concepts of unbiasedness, consistency, efficiency, maximum-likelihood estimation,


in fact, all the concepts of traditional theory, utterly lose their meaning by the time an applied researcher pulls from the bramble of computer output the one thorn of a model he likes best, the one he chooses to portray as a rose. The consuming public is hardly fooled by this chicanery. The econometrician's shabby art is humorously and disparagingly labelled "data mining," "fishing," "grubbing," "number crunching." A joke evokes the Inquisition: "If you torture the data long enough, Nature will confess" (Coase). Another suggests methodological fickleness: "Econometricians, like artists, tend to fall in love with their models" (wag unknown). Or how about: "There are two things you are better off not watching in the making: sausages and econometric estimates."

This is a sad and decidedly unscientific state of affairs we find ourselves in. Hardly anyone takes data analyses seriously. Or perhaps more accurately, hardly anyone takes anyone else's data analyses seriously. Like elaborately plumed birds who have long since lost the ability to procreate but not the desire, we preen and strut and display our t-values.

If we want to make progress, the first step we must take is to discard the counterproductive goal of objective inference. The dictionary defines an inference as a logical conclusion based on a set of facts. The "facts" used for statistical inference about $\theta$ are first the data, symbolized by $x$, second a conditional probability density, known as a sampling distribution, $f(x \mid \theta)$, and, third, explicitly for a Bayesian and implicitly for "all others," a marginal or prior probability density function $f(\theta)$. Because both the sampling distribution and the prior distribution are actually opinions and not facts, a statistical inference is and must forever remain an opinion.

What is a fact? A fact is merely an opinion held by all, or at least held by a set of people you regard to be a close approximation to all.¹ For some that set includes only one person. I myself have the opinion that Andrew Jackson was the sixteenth president of the United States. If many of my friends agree, I may take it to be a fact. Actually, I am most likely to regard it to be a fact if the authors of one or more books say it is so. The difference between a fact and an opinion for purposes of decision making and inference is that when I use opinions, I get uncomfortable. I am not too uncomfortable with the opinion that error terms are normally distributed because most econometricians make use of that assumption. This observation has deluded me into thinking that the opinion that error terms are normal may be a fact, when I know deep inside that normal distributions are actually used only for convenience. In contrast, I am quite uncomfortable using a prior distribution, mostly I suspect because hardly anyone uses them. If convenient prior distributions were used as often as convenient sampling distributions, I suspect that I could be as easily deluded into thinking that prior distributions are facts as I have been into thinking that sampling distributions are facts.

¹This notion of "truth by consensus" is espoused by Thomas Kuhn (1962) and Michael Polanyi (1964). Oscar Wilde agrees by dissent: "A truth ceases to be true when more than one person believes it."

To emphasize this hierarchy of statements, I display them in order: truths; facts; opinions; conventions. Note that I have added to the top of the order, the category truths. This will appeal to those of you who feel compelled to believe in such things. At the bottom are conventions. In practice, it may be difficult to distinguish a fact from a convention, but when facts are clearly unavailable, we must strongly resist the deceit or delusion that conventions can represent.

What troubles me about using opinions is their whimsical nature. Some mornings when I arise, I have the opinion that Raisin Bran is better than eggs. By the time I get to the kitchen, I may well decide on eggs, or oatmeal. I usually do recall that the sixteenth president distinguished himself. Sometimes I think he was Jackson; often I think he was Lincoln.

A data analysis is similar. Sometimes I take the error terms to be correlated, sometimes uncorrelated; sometimes normal and sometimes nonnormal; sometimes I include observations from the decade of the fifties, sometimes I exclude them; sometimes the


equation is linear and sometimes nonlinear; sometimes I control for variable $z$, sometimes I don't. Does it depend on what I had for breakfast?

As I see it, the fundamental problem facing econometrics is how adequately to control the whimsical character of inference, how sensibly to base inferences on opinions when facts are unavailable. At least a partial solution to this problem has already been formed by practicing econometricians. A common reporting style is to record the inferences implied by alternative sets of opinions. It is not unusual to find tables that show how an inference changes as variables are added to or deleted from the equation. This kind of sensitivity analysis reports special features of the mapping from the space of assumptions to the space of inferences. The defect of this style is that the coverage of assumptions is infinitesimal, in fact a zero volume set in the space of assumptions. What is needed instead is a more complete, but still economical way to report the mapping of assumptions into inferences. What I propose to do is to develop a correspondence between regions in the assumption space and regions in the inference space. I will report that all assumptions in a certain set lead to essentially the same inference. Or I will report that there are assumptions within the set under consideration that lead to radically different inferences. In the latter case, I will suspend inference and decision, or I will work harder to narrow the set of assumptions.

Thus what I am asserting is that the choice of a particular sampling distribution, or a particular prior distribution, is inherently whimsical. But statements such as "The sampling distribution is symmetric and unimodal" and "My prior is located at the origin" are not necessarily whimsical, and in certain circumstances do not make me uncomfortable.

To put this somewhat differently, an inference is not believable if it is fragile, if it can be reversed by minor changes in assumptions. As consumers of research, we correctly reserve judgment on an inference until it stands up to a study of fragility, usually by other researchers advocating opposite opin-

individual researchers to perform their own sensitivity analyses, and we ought to be demanding much more complete and more honest reporting of the fragility of claimed inferences.

The job of a researcher is then to report economically and informatively the mapping from assumptions into inferences. In a slogan, "The mapping is the message." The mapping does not depend on opinions (assumptions), but reporting the mapping economically and informatively does. A researcher has to decide which assumptions or which sets of alternative assumptions are worth reporting. A researcher is therefore forced either to anticipate the opinions of his consuming public, or to recommend his own opinions. It is actually a good idea to do both, and a serious defect of current practice is that it concentrates excessively on convincing one's self and, as a consequence, fails to convince the general professional audience.

The whimsical character of econometric inference has been partially controlled in the past by an incomplete sensitivity analysis. It has also been controlled by the use of conventions. The normal distribution is now so common that there is nothing at all whimsical in its use. In some areas of study, the list of variables is partially conventional, often based on whatever list the first researcher happened to select. Even conventional prior distributions have been proposed and are used with nonnegligible frequency. I am referring to Robert Shiller's (1973) smoothness prior for distributed lag analysis and to Arthur Hoerl and Robert Kennard's (1970) ridge regression prior. It used to aggravate me that these methods seem to find public favor whereas overt and complete Bayesian methods such as my own proposals (1972) for distributed lag priors are generally ignored. However, there is a very good reason for this: the attempt to form a prior distribution from scratch involves an untold number of partly arbitrary decisions. The public is rightfully resistant to the whimsical inferences which result, but at the same time is receptive to the use of priors in ways that control the whimsy. Though the use of conventions does control the whimsy, it can do
ions. It is, however, much more efficient for so at the cost of relevance. Inferences based

on Hoerl and Kennard's conventional "ridge regression" prior are usually irrelevant, because it is rarely sensible to take the prior to be spherical and located at the origin, and because a closer approximation to prior belief can be suspected to lead to substantially different inferences. In contrast, the conventional assumption of normality at least uses a distribution which usually cannot be ruled out altogether. Still, we may properly demand a demonstration that the inferences are insensitive to this distributional assumption.

A. The Horizon Problem: Sherlock Holmes Inference

Conventions are not to be ruled out altogether, however. One can go mad trying to report completely the mapping from assumptions into inferences since the space of assumptions is infinite dimensional. A formal statistical analysis therefore has to be done within the limits of a reasonable horizon. An informed convention can usefully limit this horizon. If it turned out that sensible neighborhoods of distributions around the normal distribution 99 times out of 100 produced the same inference, then we could all agree that there are other more important things to worry about, and we may properly adopt the convention of normality. The consistency of least squares estimates under wide sets of assumptions is used improperly as support for this convention, since the inferences from a given finite sample may nonetheless be quite sensitive to the normality assumption.²

² In particular, least squares estimates are completely sensitive to the independence assumption, since by choice of sample covariance matrix a generalized least squares estimate can be made to assume any value whatsoever (see my 1981 paper).

The truly sharp distinction between inference from experimental and inference from nonexperimental data is that experimental inference sensibly admits a conventional horizon in a critical dimension, namely the choice of explanatory variables. If fertilizer is randomly assigned to plots of land, it is conventional to restrict attention to the relationship between yield and fertilizer, and to proceed as if the model were perfectly specified, which in my notation means that the misspecification matrix M is the zero matrix. There is only a small risk that when you present your findings, someone will object that fertilizer and light level are correlated, and there is an even smaller risk that the conventional zero value for M will lead to inappropriate inferences. In contrast, it would be foolhardy to adopt such a limited horizon with nonexperimental data. But if you decide to include light level in your horizon, then why not rainfall; and if rainfall, then why not temperature; and if temperature, then why not soil depth, and if soil depth, then why not the soil grade; ad infinitum. Though this list is never ending, it can be made so long that a nonexperimental researcher can feel as comfortable as an experimental researcher that the risk of having his findings upset by an extension of the horizon is very low. The exact point where the list is terminated must be whimsical, but the inferences can be expected not to be sensitive to the termination point if the horizon is wide enough.

Still, the horizon within which we all do our statistical analyses has to be ultimately troublesome, since there is no formal way to know what inferential monsters lurk beyond our immediate field of vision. "Diagnostic" tests with explicit alternative hypotheses such as the Durbin-Watson test for first-order autocorrelation do not truly ask if the horizon should be extended, since first-order autocorrelation is explicitly identified and clearly in our field of vision. Diagnostic tests such as goodness-of-fit tests, without explicit alternative hypotheses, are useless since, if the sample size is large enough, any maintained hypothesis will be rejected (for example, no observed distribution is exactly normal). Such tests therefore degenerate into elaborate rituals for measuring the effective sample size.

The only way I know to ask the question whether the horizon is wide enough is to study the anomalies of the data. In the words of the physiologist, C. Bernard:

    A great surgeon performs operations for stones by a single method; later he
    makes a statistical summary of deaths and recoveries, and he concludes from these statistics that the mortality law for this operation is two out of five. Well, I say that this ratio means literally nothing scientifically, and gives no certainty in performing the next operation. What really should be done, instead of gathering facts empirically, is to study them more accurately, each in its special determinism... by statistics, we get a conjecture of greater or less probability about a given case, but never any certainty, never any absolute determinism... only basing itself on experimental determinism can medicine become a true science.
    [1927, pp. 137-38]

A study of the anomalies of the data is what I have called "Sherlock Holmes" inference, since Holmes turns statistical inference on its head: "It is a capital mistake to theorize before you have all the evidence. It biases the judgements." Statistical theory counsels us to begin with an elicitation of opinions about the sampling process and its parameters; the theory, in other words. After that, data may be studied in a purely mechanical way. Holmes warns that this biases the judgements, meaning that a theory constructed before seeing the facts can be disastrously inappropriate and psychologically difficult to discard. But if theories are constructed after having studied the data, it is difficult to establish by how much, if at all, the data favor the data-instigated hypothesis. For example, suppose I think that a certain coefficient ought to be positive, and my reaction to the anomalous result of a negative estimate is to find another variable to include in the equation so that the estimate is positive. Have I found evidence that the coefficient is positive? It would seem that we should require evidence that is more convincing than the traditional standard. I have proposed a method for discounting such evidence (1974). Initially, when you regress yield on fertilizer as in equation (2), you are required to assess a prior distribution for the experimental bias parameter $\beta^*$; that is, you must select the misspecification matrix M. Then, when the least squares estimate of $\beta$ turns out to be negative, and you decide to include in the equation the light level as well as the fertilizer level, you are obligated to form a prior for the light coefficient $\gamma$ consistent with the prior for $\beta^*$, given that $\beta^* = \gamma r_1$, where $r_1$ is the regression coefficient of light on fertilizer.³

³ In a randomized experiment with $r_1 = 0$, the constraint $\beta^* = \gamma r_1$ is irrelevant, and you are free to play these exploratory games without penalty. This is a very critical difference between randomized experiments and nonrandomized nonexperiments.

This method for discounting the output of exploratory data analysis requires a discipline that is lacking even in its author. It is consequently important that we reduce the risk of Holmesian discoveries by extending the horizon reasonably far. The degree of a polynomial or the order of a distributed lag need not be data instigated, since the horizon is easily extended to include high degrees and high orders. It is similarly wise to ask yourself before examining the data what you would do if the estimate of your favorite coefficient had the wrong sign. If that makes you think of a specific left-out variable, it is better to include it from the beginning.

Though it is wise to select a wide horizon to reduce the risk of Holmesian discoveries, it is mistaken then to analyze a data set as if the horizon were wide enough. Within the limits of a horizon, no revolutionary inference can be made, since all possible inferences are predicted in advance (admittedly, some with low probabilities). Within the horizon, inference and decision can be turned over completely to a computer. But the great human revolutionary discoveries are made when the horizon is extended for reasons that cannot be predicted in advance and cannot be computerized. If you wish to make such discoveries, you will have to poke at the horizon, and poke again.

V. An Example

This rhetoric is understandably tiring. Methodology, like sex, is better demonstrated than discussed, though often better anticipated than experienced. Accordingly, let me give you an example of what all this
ranting and raving is about. I trust you will find it even better in the experience than in the anticipation. A problem of considerable policy importance is whether or not to have capital punishment. If capital punishment had no deterrent value, most of us would prefer not to impose such an irreversible punishment, though, for a significant minority, the pure joy of vengeance is reason enough. The deterrent value of capital punishment is, of course, an empirical issue. The unresolved debate over its effectiveness began when evolution was judging the survival value of the vengeance gene. Nature was unable to make a decisive judgment. Possibly econometricians can.

In Table 1, you will find a list of variables that are hypothesized to influence the murder rate.⁴ The data to be examined are state-by-state murder rates in 1950. The variables are divided into three sets. There are four deterrent variables that characterize the criminal justice system, or in economic parlance, the expected out-of-pocket cost of crime. There are four economic variables that measure the opportunity cost of crime. And there are six social/environmental variables that possibly condition the taste for crime. This leaves unmeasured only the expected rewards for criminal behavior, though these are possibly related to the economic and social variables and are otherwise assumed not to vary from state to state.

⁴ This material is taken from a study by a student of mine, Walter McManus (1982).

TABLE 1-VARIABLES USED IN THE ANALYSIS

a. Dependent Variable
   M = Murder rate per 100,000, FBI estimate.
b. Independent Deterrent Variables
   PC = (Conditional) Probability of conviction for murder given commission. Defined by PC = C/Q, where C = convictions for murder, Q = M · NS, NS = state population. This is to correct for the fact that M is an estimate based on a sample from each state.
   PX = (Conditional) Probability of execution given conviction (average number of executions 1946-50 divided by C).
   T = Median time served in months for murder by prisoners released in 1951.
   XPOS = A dummy equal to 1 if PX > 0.
c. Independent Economic Variables
   W = Median income of families in 1949.
   X = Percent of families in 1949 with less than one-half W.
   U = Unemployment rate.
   LF = Labor force participation rate.
d. Independent Social and Environmental Variables
   NW = Percent nonwhite.
   AGE = Percent 15-24 years old.
   URB = Percent urban.
   MALE = Percent male.
   FAMHO = Percent of families that are husband and wife both present families.
   SOUTH = A dummy equal to 1 for southern states (Alabama, Arkansas, Delaware, Florida, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia).
e. Weighting Variable
   SQRTNF = Square root of the population of the FBI-reporting region. Note that weighting is done by multiplying variables by SQRTNF.
f. Level of Observation
   Observations are for 44 states, 35 executing and 9 nonexecuting.
   The executing states are: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Illinois, Indiana, Kansas, Kentucky, Louisiana, Maryland, Massachusetts, Mississippi, Missouri, Nebraska, Nevada, New Jersey, New Mexico, New York, North Carolina, Ohio, Oklahoma, Oregon, Pennsylvania, South Carolina, South Dakota, Tennessee, Texas, Virginia, Washington, West Virginia.
   The nonexecuting states are: Idaho, Maine, Minnesota, Montana, New Hampshire, Rhode Island, Utah, Wisconsin, Wyoming.

A simple regression of the murder rate on all these variables leads to the conclusion that each additional execution deters thirteen murders, with a standard error of seven. That seems like such a healthy rate of return, we might want just to randomly draft executees from the population at large. This proposal would be unlikely to withstand the scrutiny of any macroeconomists who are skilled at finding rational expectations equilibria.

The issue I would like to address instead is whether this conclusion is fragile or not. Does it hold up if the list of variables in the model is changed? Individuals with different experiences and different training will find different subsets of the variables to be candidates for omission from the equation. Five different lists of doubtful variables are reported in Table 2. A right winger expects
the punishment variables to have an effect, but treats all other variables as doubtful. He wants to know whether the data still favor the large deterrent effect, if he omits some of these doubtful variables. The rational maximizer takes the variables that measure the expected economic return of crime as important, but treats the taste variables as doubtful. The eye-for-an-eye prior treats all variables as doubtful except the probability of execution. An individual with the bleeding heart prior sees murder as the result of economic impoverishment. Finally, if murder is thought to be a crime of passion then the punishment variables are doubtful.

TABLE 2-ALTERNATIVE PRIOR SPECIFICATIONS

Prior               PC  PX  T   XPOS  W   X   U   LF  NW  AGE  URB  MALE  FAMHO  SOUTH
Right Winger        I   I   I   *     D   D   D   D   D   D    D    D     D      D
Rational Maximizer  I   I   I   *     I   I   I   I   D   D    D    D     D      D
Eye-for-an-Eye      I   I   D   *     D   D   D   D   D   D    D    D     D      D
Bleeding Heart      D   D   D   *     I   I   I   I   D   D    D    D     D      D
Crime of Passion    D   D   D   *     I   I   I   I   I   I    I    I     I      I

Notes: 1) I indicates variables considered important by a researcher with the respective prior. Thus, every model considered by the researcher will include these variables. D indicates variables considered doubtful by the researcher. * indicates XPOS, the dummy equal to 1 for executing states. Each prior was pooled with the data two ways: one with XPOS treated as important, and one with it as doubtful.
2) With five basic priors and XPOS treated as doubtful or important by each, we get ten alternative prior specifications.

In Table 3, I have listed the extreme estimates that could be found by each of these groups of researchers. The right-winger minimum of -22.56 means that a regression of the murder rate data on the three punishment variables and a suitably selected linear combination of the other variables yields an estimate of the deterrent effect equal to 22.56 lives per execution. It is possible also to find an estimate of -.86. Anything between these two extremes can be similarly obtained; but no estimate outside this interval can be generated no matter how the doubtful variables are manipulated (linearly). Thus the right winger can report that the inference from this data set that executions deter murders is not fragile. The rational maximizer similarly finds that conclusion insensitive to choice of model, but the other three priors allow execution actually to encourage murder, possibly by a brutalizing effect on society.

TABLE 3-EXTREME ESTIMATES OF THE EFFECT OF EXECUTIONS ON MURDERS

Prior               Minimum Estimate  Maximum Estimate
Right Winger        -22.56            -.86
Rational Maximizer  -15.91            -10.24
Eye-for-an-Eye      -28.66            1.91
Bleeding Heart      -25.59            12.37
Crime of Passion    -17.32            4.10

Note: Least squares is -13.22 with a standard error of 7.2.

I come away from a study of Table 3 with the feeling that any inference from these data about the deterrent effect of capital punishment is too fragile to be believed. It is possible credibly to narrow the set of assumptions, but I do not think that a credibly large set of alternative assumptions will lead to a sharp set of estimates. In another paper (1982), I found a narrower set of priors still leads to inconclusive inferences. And I have ignored the important simultaneity issue (the death penalty may have been imposed in crime ridden states to deter murder) which is often a source of great inferential fragility.

VI. Conclusions

After three decades of churning out estimates, the econometrics club finds itself under critical scrutiny and faces incredulity as never before. Fischer Black writes of "The Trouble with Econometric Models." David
Hendry queries "Econometrics: Alchemy or Science?" John W. Pratt and Robert Schlaifer question our understanding of "The Nature and Discovery of Structure." And Christopher Sims suggests blending "Macroeconomics and Reality."

It is apparent that I too am troubled by the fumes which leak from our computing centers. I believe serious attention to two words would sweeten the atmosphere of econometric discourse. These are whimsy and fragility. In order to draw inferences from data as described by econometric texts, it is necessary to make whimsical assumptions. The professional audience consequently and properly withholds belief until an inference is shown to be adequately insensitive to the choice of assumptions. The haphazard way we individually and collectively study the fragility of inferences leaves most of us unconvinced that any inference is believable. If we are to make effective use of our scarce data resource, it is therefore important that we study fragility in a much more systematic way. If it turns out that almost all inferences from economic data are fragile, I suppose we shall have to revert to our old methods lest we lose our customers in government, business, and on the boardwalk at Atlantic City.

REFERENCES

Bernard, C., An Introduction to the Study of Experimental Method, New York: MacMillan, 1927.
Black, Fischer, "The Trouble with Econometric Models," Financial Analysts Journal, March/April 1982, 35, 3-11.
Friedman, Milton, A Theory of the Consumption Function, Princeton: Princeton University Press, 1957.
Hendry, David, "Econometrics-Alchemy or Science?," Economica, November 1980, 47, 387-406.
Hoerl, Arthur E. and Kennard, Robert W., "Ridge Regression: Biased Estimation for Nonorthogonal Problems," Technometrics, February 1970, 12, 55-67.
Kuhn, Thomas S., The Structure of Scientific Revolutions, Chicago: University of Chicago Press, 1962.
Lakatos, Imre, "Falsification and the Methodology of Scientific Research Programmes," in his and A. Musgrave, eds., Criticism and the Growth of Knowledge, Cambridge: Cambridge University Press, 1969.
Leamer, Edward E., "A Class of Prior Distributions and Distributed Lag Analysis," Econometrica, November 1972, 40, 1059-81.
______, "False Models and Post-data Model Construction," Journal American Statistical Association, March 1974, 69, 122-31.
______, Specification Searches: Ad Hoc Inference with Non-experimental Data, New York: Wiley, 1978.
______, "Techniques for Estimation with Incomplete Assumptions," IEEE Conference on Decision and Control, San Diego, December 1981.
______, "Sets of Posterior Means with Bounded Variance Priors," Econometrica, May 1982, 50, 725-36.
McManus, Walter, "Bayesian Estimation of the Deterrent Effect of Capital Punishment," mimeo., University of California-Los Angeles, 1981.
Plott, Charles R. and Smith, Vernon L., "An Experimental Examination of Two Exchange Institutions," Review of Economic Studies, February 1978, 45, 133-53.
Polanyi, Michael, Personal Knowledge, New York: Harper and Row, 1964.
Pratt, John W. and Schlaifer, Robert, "On the Nature and Discovery of Structure," mimeo., 1979.
Shiller, Robert, "A Distributed Lag Estimator Derived From Smoothness Priors," Econometrica, July 1973, 41, 775-88.
Sims, C. A., "Macroeconomics and Reality," Econometrica, January 1980, 48, 1-48.
______, "Scientific Standards in Econometric Modeling," mimeo., 1982.
Smith, Vernon L., "Relevance of Laboratory Experiments to Testing Resource Allocation Theory," in J. Kmenta and J. Ramsey, eds., Evaluation of Econometric Models, New York: Academic Press, 1980, 345-77.
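
Illustrative sketch (not part of the original article). The fragility question posed in Section V, whether an estimate holds up when the list of variables is changed, can be illustrated with a much simpler calculation than the one behind Table 3. The sketch below is an assumption-laden simplification: it uses synthetic data rather than the murder-rate data, it only drops or keeps whole doubtful variables rather than searching over linear combinations of them as the text describes, and the names `focus_estimate` and `subset_range` are hypothetical helpers introduced here for illustration.

```python
import itertools

import numpy as np


def focus_estimate(y, focus, kept):
    """OLS coefficient on `focus` from regressing y on a constant, focus, and the kept columns."""
    n = len(y)
    X = np.column_stack([np.ones(n), focus, kept])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]  # position 0 is the constant, position 1 is the focus variable


def subset_range(y, focus, important, doubtful):
    """Smallest and largest focus coefficient over all subsets of the doubtful variables.

    The important variables are kept in every regression; each doubtful
    variable is either included or omitted (a simplification of the
    linear-combination search described in the text).
    """
    k = doubtful.shape[1]
    estimates = []
    for r in range(k + 1):
        for cols in itertools.combinations(range(k), r):
            idx = np.array(cols, dtype=int)
            kept = np.column_stack([important, doubtful[:, idx]])
            estimates.append(focus_estimate(y, focus, kept))
    return min(estimates), max(estimates)


# Synthetic data (assumption): the focus variable is correlated with the
# first doubtful variable, so omitting that variable shifts the estimate.
rng = np.random.default_rng(0)
n = 44
doubtful = rng.normal(size=(n, 3))
important = rng.normal(size=(n, 1))
focus = 0.8 * doubtful[:, 0] + rng.normal(size=n)
y = -1.0 * focus + 3.0 * doubtful[:, 0] + 0.5 * important[:, 0] + rng.normal(size=n)

lo, hi = subset_range(y, focus, important, doubtful)
full = focus_estimate(y, focus, np.column_stack([important, doubtful]))
print(f"full model: {full:.2f}; range over subsets: [{lo:.2f}, {hi:.2f}]")
```

If the reported range straddles zero, the sign of the effect depends on which doubtful variables are kept, which is the sense of "fragile" used in the text. Because omitting a variable is a special case of a linear restriction, bounds computed over all linear combinations of the doubtful variables would generally be at least as wide as this subset range.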
