
Psychological Bulletin
1991, Vol. 110, No. 3, 486-498

Copyright 1991 by the American Psychological Association, Inc.
0033-2909/91/$3.00

Costs and Benefits of Judgment Errors: Implications for Debiasing


Hal R. Arkes
Ohio University
Some authors have questioned the ecological validity of judgmental biases demonstrated in the laboratory. One objection to these demonstrations is that evolutionary pressures would have rendered such maladaptive behaviors extinct if they had any impact in the "real world." I attempt to show that even beneficial adaptations may have costs. I extend this argument to propose three types of judgment errors (strategy-based errors, association-based errors, and psychophysically based errors), each of which is a cost of a highly adaptive system. This taxonomy of judgment behaviors is used to advance hypotheses as to which debiasing techniques are likely to succeed in each category.

During the last two decades, cognitive psychologists documented many types of judgment and decision-making errors. Spurred in good measure by the work of Tversky and Kahneman (1974), the research area has come to be known as "judgment under uncertainty." The relatively poor performance of subjects in many of these judgment experiments has caused some researchers to question the ecological validity of such studies (e.g., Berkeley & Humphreys, 1982; Edwards, 1983; Funder, 1987; Hogarth, 1981; Phillips, 1983). The reasoning seems to be that because subjects' performance is so poor in these experiments, it may not be representative of their behavior in more naturalistic environments in which people seem quite competent. One purpose of the present article is to argue that even successful adaptations can have costs. This position is common in biology. Its application to the areas of judgment and decision making will, I hope, help explain why particular judgment behaviors persist despite their obvious drawbacks in some situations. My goal is to show that the costs of otherwise beneficial cognitive adaptations are the consequence of appropriate responses to environmental demands. A second goal of this article is to propose a taxonomy of judgment behaviors based on the nature of the adaptational costs. The use of this taxonomy may help suggest what type of debiasing techniques may be effective in each category of judgment behavior.


Costs and Benefits From an Evolutionary Perspective


To examine maladaptive judgment behaviors, we need to consider what makes any characteristic adaptive. Viewed from an evolutionary perspective, adaptive behaviors contribute to reproductive success, although the route from good judgment performance to reproductive success may be quite indirect. Because it is not obvious how serious judgment errors could possibly be adaptive, there is reason for psychologists to assume that these maladaptive behaviors exist mainly in artificial laboratory environments and not in naturalistic ecologies. However, Archer (1988) pointed out that successful adaptations have costs as well as benefits. Consider first a physiological example. In dangerous situations, the body mobilizes for a fight or flight response. This is highly adaptive. However, prolonged stress will result in serious physical deterioration. This is a maladaptive long-term consequence of a response that is generally beneficial. Hence, stomach ulceration is not grounds for deeming the alarm reaction to be maladaptive. A phylogenetic example is the emergence of upright gait. Although upright gait has resulted in epidemic levels of lower back pain in humans, it has also had substantial adaptive consequences (e.g., freeing the hands for tool use). The benefit outweighs the cost. A psychological example is provided by the costs and benefits of expertise (Arkes & Freedman, 1984; Arkes & Harkness, 1980). Experts have substantial background knowledge that they can draw on to instantiate the missing slots of incomplete schemata. They subsequently demonstrate a tendency to recall the instantiated information as having been presented when in fact it was not. For example, Arkes and Harkness (1980) showed that speech therapy students who made a diagnosis of Down's syndrome tended to remember having seen the symptom "fissured tongue." However, this common symptom of Down's syndrome had not been presented in the actual list of symptoms. Because nonexperts have less background knowledge, they are less likely to make this type of error. Thus, even though we all agree that expertise is beneficial, it does have its costs. In an analogous way, the presence of widespread, maladaptive judgment strategies is not necessarily contrary to the principles of evolution. (See also Einhorn & Hogarth, 1981, p. 58.)

I am grateful to Bruce Carlson and Robyn Dawes for their helpful suggestions on an earlier draft of this article. Daniel Kahneman and two anonymous reviewers also provided very constructive comments. Correspondence concerning this article should be addressed to Hal R. Arkes, Department of Psychology, Ohio University, Athens, Ohio 45701.

I divide the judgment and decision-making errors documented in the literature into three broad categories. Each category of errors is a cost of an otherwise adaptive system. First, I present a brief overview before describing each category in more detail. Strategy-based errors occur when subjects use a suboptimal strategy; the extra effort required to use a more sophisticated strategy is a cost that often outweighs the potential benefit of enhanced accuracy. Hence, decision makers remain satisfied with the suboptimal strategy in low-stakes situations. Association-based errors are costs of an otherwise highly adaptive system of associations within semantic memory. The automaticity of such associations, generally of enormous benefit, becomes a cost when judgmentally irrelevant or counterproductive semantic associations are brought to bear on the decision or judgment. Psychophysically based errors result from the nonlinear mapping of physical stimuli onto psychological responses. Such errors represent costs incurred in the less frequent stimulus ranges where very high and very low stimulus magnitudes are located. These costs are more than offset by sensitivity gains in the more frequent stimulus ranges located in the central portion of the stimulus spectrum. I now present a more detailed description of each of these three categories of judgment errors.

                                                 Tom does want to       Tom does not want to
                                                 date this woman        date this woman
The woman has a good sense of humor                   cell a                  cell b
The woman does not have a good sense of humor         cell c                  cell d

Figure 1. Example of information presented to subjects in the study by Harkness, DeBono, and Borgida (1985). (From "Personal Involvement and Strategies for Making Contingency Judgments: A Stake in the Dating Game Makes a Difference" by A. R. Harkness, K. G. DeBono, and E. Borgida, 1985, Journal of Personality and Social Psychology, 49, p. 25. Copyright 1985 by the American Psychological Association. Adapted by permission.)
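The cell entries in Figure 1 can be turned directly into the four covariation strategies that Harkness et al. (1985) distinguished (they are described under Debiasing Strategy-Based Errors below). The following is a minimal sketch with invented cell counts, not the authors' scoring procedure:

```python
# Illustrative sketch: four covariation strategies applied to a 2 x 2 table
# laid out as in Figure 1. The counts below are hypothetical, not data from
# Harkness, DeBono, and Borgida (1985).

def covariation_strategies(a, b, c, d):
    """a = humor & date, b = humor & no date, c = no humor & date, d = no humor & no date."""
    return {
        # Cell A strategy: count only the confirming cases.
        "cell_a": a,
        # A minus B: compare dating vs. not dating among women with humor.
        "a_minus_b": a - b,
        # Sum of diagonals: (a + d) versus (b + c).
        "sum_of_diagonals": (a + d) - (b + c),
        # Conditional probability strategy: P(date | humor) - P(date | no humor).
        "conditional_probability": a / (a + b) - c / (c + d),
    }

print(covariation_strategies(a=6, b=2, c=4, d=4))
```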

Three Types of Judgment Errors


Strategy-Based Judgment Errors

Despite the fact that poor judgment performance has been demonstrated in a large number of situations (Kahneman, Slovic, & Tversky, 1982), evidence exists that some suboptimal behaviors may be adaptive in a larger sense. Suppose a person adopts a quick and dirty strategy to solve a problem. Because it is quick, it is easy to execute. This is a benefit. Because it is dirty, it results in more errors than a more meticulous strategy. This is a cost. Although the choice of this strategy may result in fewer correct answers compared with the other strategy, this cost may be outweighed by the benefit of time and effort saved. Thorngate (1980) and Johnson and Payne (1985) compared the performance of various decision strategies with regard to their ability to select alternatives with the highest expected value. Some strategies were quite rudimentary. For example, some completely ignored the probability of an outcome and only considered the average payoff for each possible choice. It was found that many of the 10 decision strategies performed well. Even the primitive ones selected alternatives with the highest expected value under some circumstances and almost never selected alternatives with the lowest. The use of such elementary strategies would be drastically less taxing to the human information-processing system than would the use of more complete but complicated ones. Hence, if a suboptimal strategy were to be used, the large savings in cognitive effort might far outweigh the small loss in potential outcomes. This point was stressed a number of years ago by Beach and Mitchell (1978). When subjects know that the stakes are high, they often can change from a suboptimal strategy to a better one. It is worth it for them to do so. For example, Harkness, DeBono, and Borgida (1985) asked undergraduates to perform a covariation estimation task. Female subjects examined data that described other women whom Tom did or did not want to date and whether each of these women possessed a particular characteristic. The presented data could be summarized as entries into a 2 X 2 matrix, such as the one presented in Figure 1. For example, the rows of the matrix might be labeled "The woman has a

good sense of humor" and "The woman does not have a good sense of humor." The columns might be labeled "Tom does want to date this woman" and "Tom does not want to date this woman." Subjects considered these data to decide how much Tom's liking for a woman covaried with a characteristic such as her sense of humor. Using a procedure developed by Shaklee and Tucker (1980), Harkness et al. (1985) were able to determine the complexity of the strategy used by the female subjects as they performed this covariation estimation task. In some groups, the strategy used was rather elementary. This was not the case, however, if the man whose preferences were being examined was someone the female subject would be going out with for the next 3 to 5 weeks. In this group, the women used complex covariation estimation strategies significantly more often. This finding is consistent with Payne's (1982) description of contingent decision behavior. The decision behavior is contingent on such factors as the reward for high levels of accuracy. Researchers who are optimistic about human judgment and decision-making performance point out that sensitivity to such factors as incentive, task complexity (Billings & Marcus, 1983; Paquette & Kida, 1988), and time pressure (Christensen-Szalanski, 1980; Payne, Bettman, & Johnson, 1988) is highly adaptive.
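The effort-accuracy trade-off described above can be illustrated with a small simulation in the spirit of Thorngate (1980) and Johnson and Payne (1985). The sketch below is hypothetical (random gambles and only two of the strategies those authors examined); it merely shows that a crude rule that ignores probabilities often still picks the option with the highest expected value:

```python
import random

random.seed(1)

def expected_value(option):
    return sum(p * payoff for p, payoff in option)

def equal_weight_value(option):
    # Crude strategy: ignore the probabilities and average the payoffs.
    return sum(payoff for _, payoff in option) / len(option)

def random_option(n_outcomes=3):
    # A gamble: several outcomes with random payoffs and normalized probabilities.
    weights = [random.random() for _ in range(n_outcomes)]
    total = sum(weights)
    return [(w / total, random.uniform(0, 100)) for w in weights]

trials = 10_000
agreements = 0
for _ in range(trials):
    options = [random_option() for _ in range(4)]
    best_by_ev = max(options, key=expected_value)
    best_by_crude = max(options, key=equal_weight_value)
    agreements += best_by_ev is best_by_crude

print(f"Crude strategy picks the EV-maximizing option in {agreements / trials:.0%} of choice sets")
```

The crude rule is wrong some of the time, but the savings in effort relative to computing full expected values is the benefit the text describes.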

Association-Based Judgment Errors Experiments in which semantic memory has been primed have become very common during the last 20 years. The principal result of such studies is that priming causes an activation of concepts related to the prime. A number of models, such as HAM (Anderson & Bower, 1973) and ACT* (Anderson, 1983) among many others, posited spreading activation as a fundamental characteristic of semantic memory. Consider a study by Kubovy (1977). When subjects were asked to report "the first digit that comes to mind," only 2.2%


chose the digit 1. When subjects were asked to report the "first one-digit number that comes to mind," 18% chose 1. This result is consistent with the tenets of spreading activation. The second group of subjects is much more likely to respond with a 1 because that digit was primed by the request to report a one-digit number. If we consider the first group to be the control group because they were asked the question in a more neutral manner, should we then consider the response of the second group to be a manifestation of bias? Have these subjects made a judgment error? The relatively high probability of reporting the digit 1 as the first one-digit number to come to mind is a consequence of the spreading activation characteristic of semantic memory. The fact that related concepts in semantic memory influence each other through this activation is essential to normal cognitive functioning.1 The benefits of spreading activation include some of the most fundamental cognitive tasks: stimulus generalization, inference, and transfer of training, for example. These substantial benefits of spreading activation are accompanied by a cost, which Kubovy (1977) demonstrated, namely, the inability of humans to prevent associated items from influencing their cognition even when those related items are irrelevant or counterproductive to judgmental accuracy. An experiment by Gilovich (1981) serves as an example from the judgment literature. Newspaper sportswriters rated the potential of various hypothetical college players to become professional football players. If a college player was said to have come from the same hometown as a current professional football player, the college player was rated much higher than if he grew up in some other town. If we assume that one's hometown has little to do with one's potential as a professional football player, then we must attribute the higher rating to the fact that the mere association between the to-be-rated player and the successful professional player was responsible for the higher rating. Another example is provided by Gregory, Cialdini, and Carpenter (1982). People who were instructed to imagine experiencing certain events subsequently rated those events as more likely to occur to them compared with subjects who did not previously imagine them. Gregory et al. (1982) explained their results in terms of availability (Tversky & Kahneman, 1973). Through the activity of imagining, items are made more available in long-term memory. As a result, such items are judged to be more probable. Whereas Kubovy (1977) increased the availability of the digit 1 by mentioning it in an unobtrusive manner, Gregory et al. were able to increase the availability of scenarios by blatantly asking subjects to imagine their occurrence. In both cases the experimenters exploited the normal working of the memory system to heighten the retrieval of some item, thereby creating an "error." Application of this same principle can result in other judgment errors, as when Tversky and Kahneman (1973) presented a list of famous women and not-so-famous men to a group of subjects. Although the list contained more men than women, the subjects erroneously claimed that the list contained more women. The notoriety of the women heightened their availability in memory and, thus, their retrievability. The many judgment errors that have been demonstrated in this manner are

certainly not adaptive if what we mean by "adaptive" implies correspondence to reality (e.g., the actual number of men and women in the list). However, I believe that such errors are a consequence of the normal operation of long-term memory: Priming content with related items or by asking the person to perform a cognitive activity will result in heightened retrievability, which can result in nonveridical estimates of frequency and probability. Again, such errors are a cost of a memory system whose principles of association and retrieval produce benefits far in excess of these costs.2 I very briefly enumerate several such errors. Explanation bias. Ross, Lepper, Strack, and Steinmetz (1977) asked subjects to read a short scenario about a person and then explain why this individual might have eventually done some specified behavior, such as contributing money to the Peace Corps or committing suicide. Subjects were assured that the to-be-explained outcome was entirely hypothetical, because it was not known what actually happened to this individual. Participants subsequently rated the probability that the person actually did each of several behaviors. Ross et al. (1977) found that subjects assigned higher probabilities to the outcome that they had explained. Making an option more available can make the option seem more probable. Hindsight bias. A judgment error closely related to availability is the hindsight bias (Fischhoff, 1975). In hindsight we tend to exaggerate the likelihood that we would have been able to predict the event beforehand. Of course, the event that did occur and its possible causes are far more available than events that never occurred. For example, after the election has taken place, subjects say that they had assigned higher probability to the winner's prospects than they actually had assigned before the election occurred (Powell, 1988). Ignoring P(D|not-H). Consider a farmer who wishes to determine if there is a relation between cloud seeding and rain. Available evidence includes the entries in a 2 x 2 table, the rows of which are "seeding" and "no seeding" and the columns of which are "rain" and "no rain." Of course, the farmer needs to examine the numbers of entries in each cell to arrive at a correct conclusion. However, many investigators found that subjects who are trying to determine the relation between cloud seeding and rain often do not consider as relevant the evidence in the "no-seeding" row (e.g., Arkes & Harkness, 1983). Similarly, students are usually astonished to learn that to determine the relation between a symptom and a disease one needs to collect data on the likelihood of a symptom when the disease is not present. Fischhoff and Beyth-Marom (1983) deemed the general tendency to ignore data when the hypothesis is not true or when the possible antecedent cause is not present to be a "meta-bias."

1 Ratcliff and McKoon (1981) questioned the validity of spreading activation theories of semantic memory. They did not question the validity of the findings that have spawned such theories, however. I believe the judgment errors I attribute to associative mechanisms could be explained by either the spreading activation or compound cue (Ratcliff & McKoon, 1981) theories.

2 Tversky and Kahneman (1974) also noted that heuristics such as availability have benefits as well as costs.


The reason why this meta-bias exists is that the hypothesized cause, but not its absence, primes relevant data. If I believe that disease D causes symptoms S, it seems obvious to ascertain the status of iS when D is present. The absence of D does not prime S; as a result, many people do not believe that the status of S needs to be ascertained in such cases. Confirmation bias. I define confirmation bias as a selective search, recollection, or assimilation of information in a way that lends spurious support to a hypothesis under consideration. Some authors put this term in quotation marks to denote that it refers to a rather loosely related group of findings. (Fischhoff & Beyth-Marom, 1983, even suggested abandoning the term because of its imprecise referent.) Confirmation bias was demonstrated by Chapman and Chapman (1967) using the Draw-a-Person Test. This is a projective instrument in which a patient draws a picture of a person, and a clinician then examines the picture for particular cues that supposedly are associated with various types of psychopathology. Because there was negligible evidence favorable to the validity of this technique, the Chapmans thought that associations between the features of the drawings and the purported diagnosis must be entirely illusory. To test this hypothesis, drawings of people were randomly paired with personality traits presumably characteristic of the person who did the drawings. Clinicians and undergraduates who viewed these drawings perceived correlations between certain drawing features and the personality traits of the person who drew the figure. For example, subjects claimed that drawings containing large eyes were frequently done by people who were said to be suspicious. Drawings with muscular figures were frequently said to be done by men who were concerned about their manliness. The Chapmans concluded that because there was no real correlation between these drawing features and personality traits, subjects must be relying on preexisting associations in perceiving this association. To test this hypothesis, the Chapmans performed a follow-up study. Undergraduates were asked to rate the strength of semantic association between the body parts emphasized in the various drawings and the personality traits said to be characteristic of the drawers. In this follow-up study, the subjects rated eyes as closely associated with suspiciousness, for example. This is precisely the illusory correlation detected by subjects in the first study: They had incorrectly reported that suspiciousness was characteristic of the persons who drew figures with large eyes. This is an illustration of the confirmation bias because the subjects assimilated the evidence in a biased way based on their preconceived association between eyes and suspiciousness. This study is quite similar to the one by Gilovich (1 98 1 ) in that a prior association results in an inappropriate consideration of the evidence. In this case, the inappropriate consideration serves to bolster the prior association. Pseudodiagnosticity. Bayes's theorem may be expressed in the following way:

P(H1|Di) = P(H1)P(Di|H1) / [P(H1)P(Di|H1) + P(H2)P(Di|H2)]

where H and D signify the hypotheses and data, respectively, and the subscript i indexes a set of data. Assume that the two hypotheses are mutually exclusive and exhaustive.

Suppose subjects are given the choice of examining one pair of data to determine P(H1). They may choose to learn P(D1|H1) and P(D1|H2); they may choose P(D2|H1) and P(D2|H2); or they may choose P(D1|H1) and P(D2|H1). It may be seen by examining Bayes's theorem that to infer the probability of H1, the choice of either of the first two pairs would be helpful. Choosing the last pair will generally not provide diagnostic information. However, the members of the last pair, P(D1|H1) and P(D2|H1), are both strongly cued when H1 is considered. This may be why these two nondiagnostic data are selected by so many subjects in a study by Doherty, Mynatt, Tweney, and Schiavo (1979). These investigators termed this nonoptimal judgment behavior pseudodiagnosticity. Overconfidence. One of the most robust findings in the judgment and decision-making literature is overconfidence (Lichtenstein, Fischhoff, & Phillips, 1982). Koriat, Lichtenstein, and Fischhoff (1980) suggested that a primary reason for unwarranted confidence is that subjects can generate supporting reasons for their decisions much more readily than contradictory ones. The supporting reasons are more strongly cued. For example, suppose I am asked whether Oslo or Leningrad is further north, and I answer "Oslo." Now I am asked to assign a confidence level to my answer. To complete this task, I search my semantic memory for the information that made Oslo seem like the correct answer. Items pertaining to Oslo's nearby glaciers and fjords are much more strongly cued than information concerning Oslo's summer warmth. The evidence I am most likely to retrieve thus is an unrepresentative sample of all available evidence, and my confidence is thereby inappropriately inflated. Because of the increase in the confidence with which an opinion is held, the process leading to overconfidence is related to the confirmation bias. Representativeness. To appreciate the nature of this heuristic, it may be instructive first to consider the term overgeneralization (Slobin, 1971). We admire the intelligence of the child who generalizes the past tense verb ending "ed" to the unfamiliar verb "revel," thereby making "reveled." We think less highly of the child who generalizes the same past tense ending to the verb "do," thereby making "doed." We call the latter behavior overgeneralization, even though it seems to be a manifestation of the same very fundamental principle we call generalization. Of course, those of us who are aware of the existence of irregular verbs can be arrogant with children about what constitutes overgeneralization of an inferential strategy outside its domain of appropriate application. Overgeneralization is a judgment error, but again I think this is a small cost of an otherwise adaptive associationistic system. The representativeness heuristic (Tversky & Kahneman, 1974) provides an example of such overgeneralization. This heuristic refers to the fact that people often judge probabilities on the basis of similarity, or representativeness. For example, in judging whether Instance A belongs to Class B, people often rely on the extent to which A seems representative of B. Of course, the probability that A belongs to B can be influenced by many factors that have no bearing on representativeness. For example, basing one's decision solely on representativeness will result in the underutilization of base rates (Kahneman & Tversky, 1973), thereby resulting in errors.
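A numerical sketch (with invented probabilities) shows why the pair P(D1|H1) and P(D2|H1) is pseudodiagnostic: without P(D1|H2), the posterior probability of H1 is simply not pinned down.

```python
def posterior_h1(prior_h1, p_d1_given_h1, p_d1_given_h2):
    """Bayes' theorem for two mutually exclusive, exhaustive hypotheses, given datum D1."""
    prior_h2 = 1 - prior_h1
    numerator = prior_h1 * p_d1_given_h1
    return numerator / (numerator + prior_h2 * p_d1_given_h2)

# Hypothetical value: P(D1|H1) is high, as in the strongly cued, pseudodiagnostic pair.
p_d1_given_h1 = 0.9

# The neglected quantity P(D1|H2) determines whether D1 supports H1 at all.
for p_d1_given_h2 in (0.1, 0.5, 0.9):
    post = posterior_h1(prior_h1=0.5, p_d1_given_h1=p_d1_given_h1, p_d1_given_h2=p_d1_given_h2)
    print(f"P(D1|H2) = {p_d1_given_h2:.1f}  ->  P(H1|D1) = {post:.2f}")
# The posterior ranges from 0.90 down to 0.50; learning P(D2|H1) instead would not settle it.
```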


Theories of category classification as old as that of Hull (1920) are based on the principle that decisions concerning the category membership of an exemplar are based on the degree of similarity between the exemplar and the category. More recent models, such as the feature-comparison model of Smith, Shoben, and Rips (1974), share this assumption. For example, to the extent cardinal shares features with the category clergyman, it is likely to be deemed a member of that category. If cardinal shares fewer features with the category bird than the category clergyman, it is deemed likely to belong to the latter category even though there are many more birds than clergymen in the world. This feature-matching process ignores base rates of the two categories; hence, it is prone to error. Judgments of similarity follow one of the most fundamental principles of cognition: stimulus generalization. It is highly adaptive that we associate items to other items with which they are related. Even a task as basic as classical conditioning requires this. The fact that cardinal is more closely associated with clergyman than bird may bode very poorly for our consideration of base rates. However, I consider this cost to be an overgeneralization of a process that serves us very well in other contexts. Thus, I suggest that the manifestation of the representativeness heuristic is another example of a cost of an otherwise highly adaptive associationistic system.

Figure 2. The value function of prospect theory (Kahneman & Tversky, 1979). (See text for discussion. From "Prospect Theory: An Analysis of Decision Under Risk" by D. Kahneman and A. Tversky, 1979, Econometrica, 47, p. 279. Copyright 1979 by Basil Blackwell Ltd. Adapted by permission.)
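The nonlinear value function in Figure 2 is referred to repeatedly in the sections that follow, so a numerical sketch may help. The power-function form and the parameters below come from Tversky and Kahneman's later (1992) estimates, not from the 1979 article reproduced in the figure; treat them as illustrative assumptions about the curve's shape only.

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    # Prospect-theory-style value function: concave for gains, convex and steeper
    # for losses. Parameters are illustrative (Tversky & Kahneman, 1992), not the
    # curve actually plotted in Figure 2.
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)

# Diminishing sensitivity: a change near the reference point matters more than the
# same objective change far out on the curve.
print(value(100) - value(0))      # change from 0 to 100, near the reference point
print(value(1100) - value(1000))  # the same 100-unit change far from it is smaller
print(value(-100))                # losses loom larger than equivalent gains
```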

Psychophysically Based Errors

From psychophysical power functions (Stevens, 1957) to prospect theory's S-shaped curve (Kahneman & Tversky, 1979), from the original Weber-Fechner log function to economists' law of diminishing returns, many theorists postulated an asymptotic curve relating external stimuli (e.g., mass, cash, light intensity) and the psychological responses to those stimuli. Figure 2 depicts the value function of prospect theory (Kahneman & Tversky, 1979), which represents one such nonlinear curve. A system that translated physical intensity in a linear manner onto psychological response would impose an immense cost on any transduction system. Extreme stimuli, which occur relatively infrequently, would have to be coded with as great a level of discriminability as the more frequent middle-range stimuli. Any nonlinear system with an asymptote at the extreme end (or ends) would have the benefit of eliminating the structures and processes needed to discriminate small changes in rare events, such as very intense sounds or extremely heavy weights. Of course, sacrificing discriminability at the ends of the continuum has a cost. An experiment by Dinnerstein (1965) illustrates this point. Dinnerstein presented subjects with a series of weights and found that subjects' ability to discriminate was finest at the center of the range of stimuli. Then a weight was introduced that was either above or below all the others. This caused the region of maximal discriminability to either rise or drop depending on whether the new weight was heavy or light. This result occurred even though the new weight was not included in the range of stimuli to be rated. This study demonstrates the adaptation of the nervous system to the available stimulus array, an adaptation that allows the perceiver to extract the optimal amount of useful information from each situation.

Of course, extracting the optimal amount of useful information is highly adaptive even though diminished sensitivity at the extremes is a cost. Several judgment errors may be a manifestation of this particular cost. I briefly enumerate several. Sunk cost effect. Economic decisions should be made based on the anticipated costs and benefits that will result from the choice of each of the alternative courses of action. Note that future costs and benefits are relevant; prior (sunk) costs are not. A judgment error occurs when sunk costs are used as a basis for decision making. The sunk cost effect is manifested in a willingness to continue spending after an investment of time, effort, or money has already been made (Arkes & Blumer, 1985). Persons who have already invested substantial amounts and who have not yet realized compensatory returns are at Point B in Figure 2. Persons in that situation are not very sensitive to further losses; a small subsequent expenditure of funds will therefore cause negligible psychological disutility. Hence, such persons are willing to "throw good money after bad" in a desperate attempt to recoup their sunk cost, even though such behavior may be irrational. If a particular project is a poor idea, the fact that it has already wasted a lot of money does not make it a better idea. Yet the sunk cost effect has been shown to be powerful (Arkes & Blumer, 1985). Psychophysics of spending. Once a person has decided to purchase a new car, for example, he or she is located in the


asymptotic region of a curve describing the psychophysics of spending. Persons in this situation would be more willing to pay $235 for a car radio compared with their willingness to buy a radio for $235 if they had not purchased a car. We have good discriminability (Le, the curve is steep) in the region of a few hundred dollars on either side of our current state. Once we are several thousand dollars away from our current state, discriminability drops (i., the curve flattens), and we no longer object to extravagant additional expenditures (Christensen, 1989). Reflection effect. Tversky and Kahneman (1981, p. 45 3) demonstrated how framing the outcomes of the same gamble as losses or as gains can lead to different decisions. They referred to this phenomenon as the reflection effect, which is illustrated in their following well-known example (p. 453):
Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences of the programs are as follows: If Program X is adopted, 200 people will be saved. If Program Y is adopted, there is a one-third probability that 600 people will be saved and a two-thirds probability that no people will be saved.

The benefit of saving 200 lives is located at Point X in Figure 2. The benefit of saving 600 lives is located at Point Y. Program Y represents a relatively small gain in value over Program X. Two hundred lives saved is so great a benefit that the additional lives that might be saved under Program Y are too small an additional benefit to warrant the risk of saving no one. Hence, about three fourths of the subjects chose Program X. Other subjects were asked to consider two other programs:
If Program B is adopted, 400 people will die. If Program A is adopted, there is a one-third probability that nobody will die and a two-thirds probability that 600 people will die.

The loss of 400 lives is located at Point B in Figure 2. The loss of 600 lives is located at Point A. Because the loss of 400 lives is so terrible, the loss of 200 additional lives represents only a small additional loss in value. Hence, about three fourths of the subjects chose Program A. The potential extra loss in value by choosing that program was more than offset by the chance of saving everyone. It is easy to see that Program X, which is generally endorsed, is the same as Program B, which is generally shunned. If the value function were strictly linear, this inconsistency would not occur. By asking some subjects to consider the problem in terms of lives gained, Tversky and Kahneman (1981) exploited the small superiority of Y over X, a superiority which is not sufficient to warrant additional risk. By asking other subjects to consider the problem in terms of lives lost, Tversky and Kahneman (1981) exploited the small inferiority of A over B, an inferiority small enough to warrant additional risk. The wording or framing of the problem directs subjects to different portions of the nonlinear curve. Anchoring The terms anchoring and anchoring and adjustment were popularized in the judgment and decision-making

literature by Tversky and Kahneman (1974), although they were discussed earlier (e.g., Slovic & Lichtenstein, 1971). First, let us consider the psychophysical research pertaining to "induction illusions" or "context effects." We know from many experiments on adaptation level theory (Helson, 1964) that when a medium-size circle is placed in a group of much larger ones, the medium one is perceived as small. This constellation of circles induces an adaptation level that is approximately at the mean value of the circles' areas, and the medium circle has an area that is below this mean. When this same medium-sized circle is placed in a context of much smaller circles, it is perceived as large. Now its area is above the adaptation level. The shift in the adaptation level is consistent with the principle discussed previously by Dinnerstein (1965): It is best to have the adaptation level change location to locate maximal discriminability near the center of the stimulus continuum. Note that this adaptation is congruent with the relation between the physical and psychological dimensions depicted in Figure 2. Point O is the current state, which is in the area of maximal discriminability. The asymptotes are in areas of diminished discriminability. Consider an analogous judgment experiment (Sutherland, Dunn, & Boyd, 1983). Sixty-four hospitalized patients rated five different health states using three different methods. In the first method, subjects assigned values to these five health states on a scale anchored by perfect health and death. In the second method, perfect health was replaced on the high end of the scale by a health state each rater had rated less desirable. Thus, the high end of the scale was no longer quite as high. In the third method, death was replaced on the low end of the scale by a health state each rater had rated more desirable. Thus, the low end of the scale was no longer quite as low. Relative to the values assigned to the various health states using the first method, the values assigned to the very same health states using the second method were lower, and those assigned using the third method were higher. This study showed that a patient's rating of possible health states was strongly influenced by the context in which such stimuli were considered. Such context-dependent effects appear to induce inconsistent ratings just as the medium-sized circle was rated differently depending on the size of the circles with which it could be compared. However, this is a small cost of an otherwise beneficial adaptation designed to extract the optimal amount of useful information out of each situation. Another group of judgment studies is more closely related to a context phenomenon demonstrated by Restle (1971). Subjects were presented with a drawing like that in Figure 3. Restle

Figure 3. Stimulus used in study by Restle (1971).



varied the length of the horizontal test line (H), the length of the vertical center line (C), which crossed the test line, and the length of the identical vertical ends lines (E). It was expected that judgments of the length of H should decrease as C or E increased. It was easy for Restle to determine the influence of E and C on H by ascertaining the slope of the function relating E to H and C to H. Restle (1971) presented subjects with one of two possible sets of instructions. One group was told, "Pay attention to the vertical lines at the ends of the test line, and use them as a frame of reference to help you in your judgments." These subjects were also warned, "Try to disregard the center vertical line." Other subjects were told just the opposite: They were to use the center vertical line as a frame of reference and to ignore the end lines." The result was that the line that subjects were told to attend to was much more influential on the subjects' judgment of the test stimulus than was the line they were told to ignore. This study differs from the study involving the circles in that instructions are used to direct the subject's attention to the reference point, which serves as the context for the ensuing judgment. Flexibility of frames of reference, which introductory psychology students first appreciate when they view a Necker cube, is essential to recognize the same object in different contexts. However, this immense benefit has a cost, and many anchoring and adjustment studies illustrate this cost. In the most famous such study, Tversky and Kahneman (1974) asked subjects to estimate the percentage of African countries in the United Nations. Subjects spun a "rigged" spinner, which landed on either 10% or 65% as a starting point. Subjects were then were asked to adjust the starting number to the level they thought was appropriate to answer the question correctly The median estimate for those who started with 10% was 25%, whereas the median estimate for those who started with 65% was 45%. By directing the subject's attention to a starting point or anchor, Tversky and Kahneman (1974) did something analogous to what Restle (1971) asked his subjects to do. When 10% is presented as the anchor, a context is induced that contains low numbers. Adjustments upward move toward the area of maximal discriminability in the central region of the spectrum. Because such adjustments in this region are perceived to be quite significant, subjects often refrain from making them as large as would be warranted. This results in the insufficient adjustment observed by Tversky and Kahneman (1974). Of course, the opposite result occurs when the subject's attention is drawn to the large anchor at the beginning of the experiment. Many other studies illustrate the influence of the anchor in analogous judgment situations (e.g., Northcraft & Neale, 1987). It is true that different anchors and the subsequent insufficient adjustment result in different final estimates given different anchors. I suggest that this "irrationality" is a worthwhile cost to achieve context-dependent judgment behavior.
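The insufficient-adjustment account can be expressed as a toy model. Nothing below comes from Tversky and Kahneman (1974) beyond the two anchors and the qualitative result; the adjustment rule and its parameter are invented for illustration.

```python
def anchored_estimate(anchor, private_belief, adjustment_rate=0.45):
    # Toy model of anchoring and adjustment: the judge moves from the anchor toward
    # his or her own belief, but only part of the way (adjustment_rate < 1).
    return anchor + adjustment_rate * (private_belief - anchor)

belief = 0.35  # hypothetical unanchored belief about the true proportion

print(anchored_estimate(anchor=0.10, private_belief=belief))  # low anchor pulls the estimate down
print(anchored_estimate(anchor=0.65, private_belief=belief))  # high anchor pulls the estimate up
```

With these invented numbers the low and high anchors yield estimates of roughly 21% and 52%, reproducing the qualitative pattern of the 25% and 45% medians reported in the text.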

Multiple Causes


To this point I have identified various biases, for example, the hindsight bias, as belonging to one of the three categories of judgment errors. However, some phenomena we term biases may have more than one cause. Hence, it would not be appropriate to categorize the bias as belonging to a category of judgment error. Instead, the various causes of the bias may be categorized according to the taxonomy presented previously. Perhaps the best example of this situation is the conjunction fallacy. Tversky and Kahneman (1983) suggested that the representativeness heuristic is one basis for this fallacy. However, they also pointed out in an earlier article (Tversky & Kahneman, 1974) that the anchoring and adjustment heuristic may play a role in the manifestation of this fallacy in some instances (e.g., Bar-Hillel, 1973). Tversky and Kahneman (1983, p. 312) contended that speakers' conformity with Gricean conversational rules (Grice, 1975) could hinder appreciation of the probabilistic law relevant to the consideration of conjunctions. Worse yet, Yates and Carlson (1986) suggested that individual subjects may use multiple procedures in arriving at their answers on different conjunction problems depending on the presence of various factors. Thus, it would be a mistake to place the conjunction fallacy itself into only one of the categories of judgment errors. It would be proper, however, to place each of the various causes of the fallacy into one of the categories. Thus, the taxonomy does not divide the judgment errors into mutually exclusive categories. I suggest that the causes can be so divided. Whether the categories are exhaustive with regard to the causes of judgment errors remains to be determined. To return to the example of the conjunction fallacy, anchoring and adjustment is a psychophysically based error. Representativeness is an association-based error, as is the overgeneralization of Gricean principles to probability estimates. If the environment contains cues that foster one of these "incriminating" behaviors, then the fallacy will occur.

Debiasing Strategy-Based Errors
Bias may not be an appropriate term to use to describe suboptimal behaviors in this category, and thus debiasing would not be an appropriate term to use to describe the adoption of strategies that result in higher accuracy levels. Suboptimal behaviors occur in this category because the effort or cost of a more diligent judgment performance is greater than the anticipated benefit. The way to improve judgment within this category is to raise the cost of using the suboptimal judgment strategy. Typically this results in the judge's utilization of the currently available data in a much more thorough way, an obviously superior strategy. Consider first the study by Harkness et al. (1985) alluded to earlier and depicted in Figure 1. The investigators identified four covariation strategies. The first, the Cell A strategy, consists of noting how many times Tom liked a woman who had a good sense of humor. The covariation between Tom's liking for the woman and her sense of humor is based on the number of times he wanted to date such a person. The second strategy, A minus B, consists of comparing Cells A and B. To the extent A exceeds B, Tom is judged to like women with a sense of humor. Note that these two strategies do not use Cells C and D. The third strategy, sum of diagonals, compares the sum of A and D


with that of B and C. To the extent the former sum exceeds the latter, Tom is judged to like women with a sense of humor. The final strategy, conditional probability is the normative assessment of covariation. These final two strategies use the data in all four cells. Harkness et al. found that 6 of the 11 women who were given information about Tom but who would not be dating him used one of the two primitive strategies. None of the 11 women who thought they would be going out with Tom used these simple strategies; they all used one of the two sophisticated covariation estimation strategies. If a subject in a lowstakes judgment environment were using only a subset of the available data, it would be obvious how to improve one's judgment should the stakes increase: Use more data. An analogous finding is exemplified in a study by Petty and Cacioppo (1984). Undergraduates were exposed to three or nine arguments that were all of either high or low quality. The arguments related to an issue that would be of importance to a group of undergraduates: The institution of a new policy in one year under which all students had to pass a comprehensive exam in their major field to graduate from the university. Needless to say, this was the "high-involvement" group. The "low-involvement" subjects also evaluated either three or nine strong or weak arguments, but the issue concerned the adoption of this new policy at a time long after this group of students would graduate. Petty and Cacioppo found that the low-involvement subjects were more persuaded by nine arguments than by three arguments. The strength of the argument was not a significant factor. For the high-involvement subjects, the strength of the arguments rather than their mere number was significant. If subjects are not concerned with a proposition, merely counting the arguments in support of it might be sufficient. If the stakes are raised, an obvious strategy is available the benefits of which are substantial: Consider the merits of the arguments. In both the Harkness et al. (1985) and the Petty and Cacioppo (1984) studies, the presence of higher stakes resulted in less superficial treatment of the data available as the basis of a decision. Tetlock and Kim (1987) found that the same end could be accomplished through slightly different means. All subjects were presented with the responses of an actual test taker to the first 16 items of a personality test. Based on these responses, subjects were first asked to write a brief personality sketch of the test taker. Then subjects were then asked to predict how the test taker would answer 16 new items. One group of subjects was told beforehand that they would be interviewed by the experimenter to learn how the subjects went about making their predictions. These accountability subjects wrote more complex personality sketches, were more accurate in their predictions of how the test takers would answer the next 16 items, and expressed less overconfidence in their predictions. Knowing that they would be held accountable for their predictions raised the stakes for the subjects, which caused them to interact with the stimulus materials in a less superficial way. Cursory interactions with currently available data cause strategy-based errors, and incentive promotes the adoption of a more thorough strategy. Association-Based Errors The influence of incentives in eliminating association-based errors is negligible, as illustrated in an experiment by Fischhoff,

Slovic, and Lichtenstein (1977). In this study, subjects assigned confidence levels to their answers to two-option questions, such as "Aden was occupied in 1839 by the (a) British or (b) French." If the analysis of this situation by Koriat et al. (1980) is correct, subjects search for reasons to support their answer. This search instills a high (and inappropriate) level of confidence. Fischhoff et al. (1977, Experiment 4) wanted to find out how intransigent this overconfidence was. Subjects were asked to wager actual money based on the confidence levels they had assigned to their answers. About 93% of the subjects agreed to wager in a game that would have been biased in their favor if their confidence levels had been appropriate for their level of accuracy I assume that the prospect of winning or losing substantial amounts of cash based on their stated confidence levels would cause subjects to scrutinize the basis of these stated levels. This additional, highly motivated scrutiny apparently led the vast majority of subjects to conclude that their stated confidence levels were justified. Nevertheless, 36 of the 39 subjects would have lost money in this game, because their high levels of confidence were not justified. Incentives are not effective in debiasing association-based errors because motivated subjects will merely perform the suboptimal behavior with more enthusiasm. An even more assiduous search for confirmatory evidence will not lower onels overconfidence to an appropriate confidence level.3 Fischhoff (1975) and others tried a direct approach to debias the hindsight effect: Tell the subjects about the bias and then warn them not to succumb to it. If the mechanism responsible for the hindsight bias is the memorial priming of an outcome by its actual occurrence, then exhortations to prevent this priming will generally not be effective because the priming of associations between related items probably occurs automatically (Neely, in press; Ratcliff& McKoon, 1981). That is, priming is unconscious and occurs with negligible capacity usage (Posner & Snyder, 1975). It would be difficult for subjects to abort a cognitive process that occurs outside of their awareness. "Please prevent associated items from influencing your thinking" would be a curious entreaty unlikely to accomplish much debiasing. There is a long history of research in cognitive psychology that demonstrates that the occurrence of automatic processes can be maladaptive. The most commonly cited example is the Stroop effect (Stroop, 1935). Subjects are shown words and are asked to name as quickly as possible the color of the ink in which the word is printed. When the word itself is the name of some color, such as red, and the ink is a different color, subjects experience difficulty in suppressing the tendency to announce red rather than the color of the ink. The activation of a word in semantic memory is "too automatic" for the subject to perform the Stroop task with facility. In an analogous way, association-based judgment errors are a

3 Thaler (1986), among others, also noted that increasing the incentive for rational behavior does not always result in heightened rationality. This presents a problem for economists who hope that the irrationalities documented by psychologists in questionnaire studies will disappear when financial incentives for rational behavior are introduced.


small cost of an otherwise adaptive association-based semantic memory system. These errors occur when items semantically related to the judgment influence it even when their influence is not conducive to increased accuracy. To diminish an association-based judgment error, neither the introduction of incentives nor entreaties to perform well will necessarily cause subjects to shift to a new judgment behavior. Instead, it will be more helpful to instruct the subjects in the use of a behavior that will add or alter associations. Instructions to perform a debiasing behavior. On the basis of earlier research by Slovic and Fischhoff (1977) and by Koriat et al. (1980), Arkes, Faust, Guilmette, and Hart (1988) presented neuropsychologists with a small case history and then asked them to state the probability that each of three possible diagnoses was correct. The estimates of these subjects comprised the foresight estimates. Other neuropsychologists were told that one of the diagnoses was correct and that they should estimate the probability they would have assigned to the three diagnoses if they did not know which one correct. These hindsight subjects exhibited a bias by assigning a higher probability level to the "correct" diagnosis than did the foresight subjects. However, hindsight subjects who had to state one reason supporting each of the diagnoses before making their probability estimates manifested no hindsight bias. The behavior of considering evidence supportive of an outcome that did not occur is unlikely to be performed by subjectswhatever their motivationunless they are asked to do so. The consequence of performing this behavior is lowering the inappropriate confidence one has in the accuracy of one's responses and reducing the magnitude of the hindsight effect. Koriat et al. (1980) found that this technique was effective in reducing the overconfidence people generally have in their answers to general knowledge questions, and Hoch (1985) found the same technique was able to lower overconfidence in forecasts made by business students. Note that this "consider the opposite" strategy (Lord, Lepper, & Preston, 1984) attempts to debias by priming stimuli other than the ones that would normally be accessed. Once this priming occurs, new causal skids are greased. The consequent influence of these new factors will occur according to the same mechanisms that led to the bias (e.g., hindsight, confirmation, overconfidence) in the first place. If the occurring event cued its own causal chains, then considering the nonoccurring event ought to accomplish the analogous result, thereby reducing the bias. Another type of debiasing has been effective against overconfidence, a bias that I have postulated is a consequence of cuing mainly supportive evidence. Murphy and Winkler (1974) found that weather forecasters have outstanding accuracy-confidence calibration. For example, there is rain on 90% of the days on which meteorologists say there is a 90% chance of rain. However, Wagenaar and Keren (1986) showed that meteorologists were very overconfident in their answers to general knowledge questions. This suggests that these professionals have not learned some general debiasing strategy like "consider the opposite," which they can then apply to domains outside their area of expertise. Instead, the absence of overconfidence for this and a very few other select groups of professionals in their area of

expertise is due to the fact that they get rapid feedback on a very large number of predictions the confidence level of which is carefully recorded. Of course, daily feedback on the appropriateness of one's confidence is a debiasing technique almost never available to most people. Cuing a debiasing behavior. Rather than instructing people in a different judgment behavior, it is possible to merely cue such a behavior. An example is provided by the research program of Nisbett, Krantz, Jepson, and Kunda (1983), who were interested in discovering independent variables that would foster the use of an appropriate statistical inference technique by college students. Tversky and Kahneman (1974) showed that many subjects did not use such inference techniques in many instances; therefore, the subjects' judgments were incorrect. Hence, the imposition on subjects of any of the effective independent variables discovered by Nisbett et al. for these tasks would constitute debiasing. For example, subjects in their third study were presented with the story of David, a high school senior who had to choose between a small liberal arts college and an Ivy League university. Several of David's friends who were attending one of the two schools provided information that seemed to favor quite strongly the liberal arts college. However, a visit by David to each school provided him with contrary information. Should David rely on the advice of his many friends (a large sample) or on his own 1-day impressions of each school (a very small sample)? Other subjects were given the same scenario with the addition of a paragraph that made them "explicitly aware of the role of chance in determining the impression one may get from a small sample" (Nisbett et al., 1983, p. 353). Namely, David drew up a list for each school of the classes and activities that might interest him during his visit there, and then he blindly dropped a pencil on the list, choosing to do those things on the list where the pencil point landed. These authors found that if the chance factors that influenced David's personal evidence base were made salient in this way, subjects would be more likely to answer questions about the scenario in a probabilistic manner (i.e., rely on the large sample provided by many friends) than if the chance factors were not made salient. Such hints, rather than blatant instruction, can provide routes to a debiasing behavior in some problems. Confidence as a second-order judgment. If I estimate the potential of college football players, the proportion of men in a list of people, or the merit of program X in combating an Asian disease, I would be performing a first-order judgment task. If I am called on to express my confidence in any of those judgments, then I am performing a second-order judgment task. By stating my confidence, I am rendering a judgment about my first-order judgment. Debiasing techniques aimed at the first-order judgment can also have salutary effects on overconfidence. For example, Tetlock and Kim (1987) found that subjects who knew that they would be held accountable for their predictions and thus were in a high-stakes situation wrote complex personality sketches of the person whose test they were reviewing. They also made more accurate predictions concerning these test takers than did subjects in a low-stakes situation. This study was used to illustrate the fact that incentives can improve strategy-based judgments. However, Tetlock and Kim found that the accountability group also was less overconfident in their judgments than the control group. I hypothesize that this result was because accountability subjects more thoroughly studied all available evidence compared with the control group. As a consequence, the usual finding of overconfidence was ameliorated. Subjects typically have to be instructed to consider evidence that is contrary to their decision to bring their confidence down to reasonable levels (Koriat et al., 1980). However, because confidence is a second-order judgment, attempts at improving the first-order judgment may also have a beneficial effect in debiasing overconfidence. (See also Arkes, Christensen, Lai, & Blumer, 1987, Experiment 2.)
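Calibration of the kind Murphy and Winkler (1974) reported for weather forecasters can be checked with a short routine. The judgments below are fabricated for illustration; the calculation simply compares stated confidence with the observed proportion correct within each confidence category.

```python
from collections import defaultdict

# Hypothetical (stated confidence, was_correct) pairs for a set of two-option questions.
judgments = [(0.6, True), (0.6, False), (0.7, True), (0.7, True), (0.7, False),
             (0.9, True), (0.9, True), (0.9, False), (0.9, False), (1.0, True)]

by_confidence = defaultdict(list)
for confidence, correct in judgments:
    by_confidence[confidence].append(correct)

for confidence in sorted(by_confidence):
    outcomes = by_confidence[confidence]
    hit_rate = sum(outcomes) / len(outcomes)
    # Overconfidence shows up as stated confidence exceeding the observed hit rate.
    print(f"stated {confidence:.0%}  observed {hit_rate:.0%}  (n = {len(outcomes)})")
```

A well-calibrated forecaster produces roughly equal stated and observed values in every row; the invented data above show the more typical pattern of overconfidence at the high confidence levels.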


Psychophysically Based Errors


Techniques effective in debiasing psychophysically based errors are quite different from those effective in debiasing association-based errors. However, we must first consider what constitutes bias. Suppose that I am in line at the betting window before the last race of the day at a race track. The man in front of me bemoans his terrible luck on the prior 11 races and decides to put all his remaining funds ($50) on a long shot in the last race. Because he lost all 11 of the prior races, we assume that he was at point B in Figure 2. The loss of $50 would not represent a significant decrease in utility. The gain of several thousand would represent an enormous increase. Calculations based on the curve in Figure 2 might indicate that his behavior was "rational"; he was maximizing expected utility. If the upper portion of Figure 2 described the relation between objective light intensity and his subjective brightness judgments, it would not have been sensible for me to say, "Sir, I respectfully point out that your judgments do not increase in a linear fashion with objective intensity. Therefore, your judgment is biased, and you might do well to correct your responses." For the same reason, it would not have been sensible for me to point out that his betting behavior was biased. Given the psychophysics of his situation, his behavior followed in an unbiased manner. Because there was no bias, there was no warrant to debias.
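A small numerical sketch may make this concrete. Everything in it is an assumption introduced for illustration (the value function, the 4% win probability, the $1,000 net payout, and the $550 already lost are not taken from the article or from Figure 2); the sketch merely shows how such a calculation could come out in the bettor's favor.

# Illustrative sketch: with a value function that flattens far from the reference
# point, a long-shot bet can maximize expected subjective value even though its
# expected monetary value is negative. All numbers below are assumptions.

def v(x):
    """Assumed S-shaped value function: concave for gains, convex and steeper for losses."""
    return x ** 0.5 if x >= 0 else -2 * (-x) ** 0.5

current = -550            # already $550 down for the day (roughly point B)
stake, net_win = 50, 1000
p_win = 0.04              # assumed probability that the long shot comes in

expected_money = p_win * net_win - (1 - p_win) * stake          # about -$8
ev_bet = p_win * v(current + net_win) + (1 - p_win) * v(current - stake)
ev_keep = v(current)

print(f"expected monetary result of betting: {expected_money:+.1f}")
print(f"expected subjective value if he bets: {ev_bet:.1f}")
print(f"subjective value if he keeps the $50: {ev_keep:.1f}")
# With these assumed numbers the bet has the higher subjective value: the extra $50
# loss barely moves him along the flat part of the loss limb, whereas a win would
# carry him across the steep region near the reference point.

This is the sense in which, given his own curve, the bettor's choice can be "unbiased": under these assumptions it maximizes his expected subjective value even though it loses money on average.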

Kahneman and Tversky (1984) pointed out that when subjects are presented with their inconsistent answers on the two versions of the Asian disease problem, many do not want to resolve the inconsistency by changing one of their answers. They can be made to realize that the inconsistency is present. However, they apparently do not consider their answers to be nonnormative or "biased." This suggests how difficult debiasing may be for psychophysically based errors. Suppose a person's own psychophysical function is not used as a basis for making a decision. Instead, a benevolent other may wish to impose a "normative" framework, thereby changing the original person's response. An example might be an accountant who, through his or her professional training, realizes that the manifestation of the sunk cost effect will have adverse effects on the economic well-being of everyone in a company. If he or she wants to debias the sunk cost effect, what avenues are possible?

Incentives, which are effective in debiasing strategy-based errors, are ineffective in debiasing psychophysically based errors. Sunk cost reasoning has been used to justify continued funding of multibillion-dollar "lemons" (Arkes & Blumer, 1985). In addition, it has been used to justify continued spending on the exceedingly expensive B-2 bomber (Staff, 1989, p. 8). If saving billions of dollars is not a sufficient monetary incentive, then we may conclude that the sunk cost effect is not particularly vulnerable to this type of debiasing. It is not clear how changing the judgment behavior to add associations, a technique effective in debiasing association-based errors, would even be applied here. I am unaware of any such attempts.

To debias psychophysically based errors, at least four techniques may be effective, however. First, because the curve relating objective gains and losses to subjective gains and losses cannot be changed, debiasing may occur when new gains or losses are added to those currently under consideration. This will change the location of the possible outcomes on the curve. For example, Northcraft and Neale (1986) asked subjects to consider spending more money on a project that appeared to be doomed to failure. Subjects in the control group tended to continue to spend, thereby manifesting the sunk cost effect. Other subjects were informed of the presence of opportunity costs. This term refers to the fact that money spent on the doomed project is unavailable for use on much more promising ventures. These superior investments represent lost opportunities if the funds are instead sunk into the doomed project. Revealing to subjects the presence of this huge additional cost made the choice of the sunk cost option much less attractive, and fewer subjects succumbed to the sunk cost effect. This situation can be understood by referring again to Figure 2. Consider the subjects who had not been made aware of the opportunity cost of a further small investment in a hopeless cause. Should the investment prove to be unsuccessful, the location of the subjects would shift from point B (their current position) to point A, a small loss in utility. However, those subjects who had been made aware of the opportunity cost of continued investment in a lost cause would realize that such behavior would actually result in a shift from point B to point Q. This represents a larger loss of psychological utility. Presenting subjects with information concerning opportunity costs decreased subjects' willingness to continue investing.

A second way to modify psychophysical judgment behaviors is to change the concatenation of related items. For example, Thaler (1985) asked subjects to decide whether Mr. A or Mr. B was happier. Mr. A won two lotteries, one for $50 and one for $25. Mr. B won a single lottery of $75. The large majority of subjects thought that Mr. A would be happier. Mr. A would receive two separate winnings, and because of the concavity of the value function in the region of gains, the sum of the values of the two winnings would be greater than the value of their sum; that is, v(25) + v(50) > v(25 + 50). Thaler (1985) pointed out that late-night mail-order advertisements take advantage of this principle by tossing in a tool set, knives, and other separable items to make the gain look particularly large. Adding more of the same product would merely push the potential buyer along the asymptote, where value increases quite slowly. Thus, segregating versus integrating gains can cause changes in one's willingness to make purchases.
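The inequality v(25) + v(50) > v(25 + 50) holds for any value function that is concave over gains, and it is easy to verify numerically. In the sketch below the power function and its exponent are illustrative assumptions rather than parameters reported by Thaler (1985).

# Segregated versus integrated gains under an assumed concave value function.
def v(gain, alpha=0.88):
    """Illustrative concave value function for gains (exponent chosen for illustration)."""
    return gain ** alpha

segregated = v(25) + v(50)   # two separate winnings, each valued from zero
integrated = v(75)           # the same $75 received as a single winning

print(f"v(25) + v(50) = {segregated:.1f}")   # roughly 48.3
print(f"v(75)         = {integrated:.1f}")   # roughly 44.7
# Splitting the same total into separate gains yields more subjective value,
# which is why the mail-order pitch tosses in the knife set as a separate item.

Any concave gain function (a square root, or a logarithm shifted to pass through the origin) produces the same ordering; only the size of the advantage changes.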



A third technique is to change one's reference point. A dieter who is 100,000 calories in arrears is at point A. Efforts to move to the right and upward on the scale will not result in much improvement in the immediate future. However, if the dieter begins the diet anew, then he or she is transposed to point O. Here improvement is easier to achieve, thanks to the shape of the curve in the region of the origin. Maxims like "Today is the first day of the rest of your life" use this principle. (A related economic analysis can be found in Loewenstein, 1988.) Fourth, one can reframe losses as gains (or gains as losses), as was accomplished by Tversky and Kahneman (1981) in their Asian disease example. (Also see McNeil, Pauker, Sox, & Tversky, 1982.) Note that these techniques do nothing to alter the shape of the psychophysical curve. Psychophysically based judgment errors occur because the relation between external stimuli and psychological responses to those stimuli is nonlinear. Because the shape of the curve depicting this relation is a given, debiasing consists of changing either the location of the options or the location of one's reference point on the curve.
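The dieting example can also be put in numbers. In the sketch below, the value function and the calorie figures are assumptions chosen only to illustrate the principle: the same day of progress produces a far larger subjective change when the reference point has been reset to the present (point O) than when it remains anchored 100,000 calories back (point A).

# Why resetting the reference point helps: the same objective improvement is valued
# very differently far out on the loss limb than near the origin. The function and
# the numbers are illustrative assumptions.

def v(x):
    """Assumed value function: concave for gains, convex and steeper for losses."""
    return x ** 0.5 if x >= 0 else -2 * (-x) ** 0.5

day_progress = 500  # calories of progress in one day

# Old frame: the day is evaluated against a 100,000-calorie deficit (point A).
old_frame = v(-100_000 + day_progress) - v(-100_000)

# New frame: "the diet starts today," so the same day counts as a gain from zero (point O).
new_frame = v(day_progress) - v(0)

print(f"subjective improvement, old frame: {old_frame:.2f}")   # about 1.6
print(f"subjective improvement, new frame: {new_frame:.2f}")   # about 22.4

Nothing about the curve itself changes; only the point from which progress is measured does, which is exactly what the maxim invites the dieter to do.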

Training
At least one type of debiasing lies outside the categorization scheme just presented. Its success is not due to its ability to counteract the cognitive behaviors characteristic of strategy-based, association-based, or psychophysically based judgment errors. This type of debiasing is professional training. When examining the financial state of a company, an accountant is unlikely to fall prey to the sunk cost effect. Standard accounting procedures simply allow no place for the consideration of sunk costs. From a psychological perspective, this is not a very interesting instance of debiasing. However, it is instructive that quite specific professional training may be necessary for debiasing to be successful. Arkes and Blumer (1985) showed that taking a course or two in general economics did not inoculate students against the sunk cost effect.

Another example of this same type of debiasing is more general and therefore much more interesting. Lehman, Lempert, and Nisbett (1988) showed that graduate training can influence subjects' statistical reasoning. For example, Lehman et al. showed that the importance of control groups is more likely to be apprehended by advanced psychology graduate students than by advanced chemistry graduate students. The superiority of the psychology students may represent a result similar to the presumed superiority of the accountants in resisting the sunk cost effect. Namely, professional training in the techniques of psychological research heightened their awareness of the importance of control groups.

Instruction in standard accounting procedures or scientific methodology represents an example of providing people with tools designed to reach a normatively appropriate answer. Edwards and von Winterfeldt (1986) pointed out that "if the problem is important and the tools are available people will use them and thus get right answers" (p. 679). Indeed, training involves giving certain (usually self-selected) people precisely those tools needed to arrive at correct answers. The decision to be trained professionally or to seek someone who is so trained is a meta-strategy that will ameliorate some judgment errors.

Conclusion
In his excellent review of the debiasing literature, Fischhoff (1982, p. 444) suggested that clarifying and exploiting the cognitive processes underlying debiasing are major theoretical and practical tasks. The debiasing literature currently contains a desultory catalog of techniques that work, techniques that do not work, and techniques that work on some tasks but not others. The purpose of this article was to divide judgment behaviors into three broad categories based on functionalist criteria, namely, the bases for their costs and benefits. With this taxonomy, it is then possible to hypothesize which variables are likely to be effective in debiasing judgment errors within each category.

Strategy-based errors occur when the cost of extra effort outweighs the potential benefit of extra accuracy. Given this premise, debiasing should occur when the benefits of accurate judgment are increased. Association-based errors are costs of an otherwise highly adaptive system of associations within semantic memory. Errors occur when semantically related but judgmentally harmful associations are brought to bear on the task. Debiasing requires the performance of a behavior that will activate different associations. Psychophysically based errors are due to the nonlinear relation between external stimuli and the subjective responses to those stimuli. Debiasing therefore requires changing the location of one's position on the curve depicting this relation or the position of one or more of the options.

Twenty years of extremely creative research have documented the presence of many judgment shortcomings. It is hoped that this taxonomy will help in the search for techniques with which we will be able to debias such errors.

References
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R., & Bower, G. H. (1973). Human associative memory. Washington, DC: Winston.
Archer, J. (1988). The sociobiology of bereavement: A reply to Littlefield and Rushton. Journal of Personality and Social Psychology, 54, 272-278.
Arkes, H. R., & Blumer, C. (1985). The psychology of sunk cost. Organizational Behavior and Human Decision Processes, 35, 125-140.
Arkes, H. R., Christensen, C., Lai, C., & Blumer, C. (1987). Two methods of reducing overconfidence. Organizational Behavior and Human Decision Processes, 39, 133-144.
Arkes, H. R., Faust, D., Guilmette, T. J., & Hart, K. (1988). Eliminating the hindsight bias. Journal of Applied Psychology, 73, 305-307.
Arkes, H. R., & Freedman, M. R. (1984). A demonstration of the costs and benefits of expertise in recognition memory. Memory & Cognition, 12, 84-89.
Arkes, H. R., & Harkness, A. R. (1980). The effect of making a diagnosis on the subsequent recognition of symptoms. Journal of Experimental Psychology: Human Learning and Memory, 6, 568-575.
Arkes, H. R., & Harkness, A. R. (1983). Estimates of contingency between two dichotomous variables. Journal of Experimental Psychology: General, 112, 117-135.

Bar-Hillel, M. (1973). On the subjective probability of compound events. Organizational Behavior and Human Performance, 9, 396-406.
Beach, L. R., & Mitchell, T. R. (1978). A contingency model for the selection of decision strategies. Academy of Management Review, 3, 439-449.
Berkeley, D., & Humphreys, P. (1982). Structuring decision problems and the "bias heuristic." Acta Psychologica, 50, 201-252.
Billings, R. S., & Marcus, S. A. (1983). Measures of compensatory and noncompensatory models of decision behavior: Process tracing versus policy capturing. Organizational Behavior and Human Performance, 31, 331-352.
Chapman, L., & Chapman, J. (1967). Genesis of popular but erroneous psychodiagnostic observations. Journal of Abnormal Psychology, 72, 193-204.
Christensen, C. (1989). The psychophysics of spending. Journal of Behavioral Decision Making, 2, 69-80.
Christensen-Szalanski, J. J. J. (1980). A further examination of the selection of problem-solving strategies: The effects of deadlines and analytic aptitudes. Organizational Behavior and Human Performance, 25, 107-122.
Dinnerstein, D. (1965). Intermanual effects of anchors on zones of maximal sensitivity in weight-discrimination. American Journal of Psychology, 78, 66-74.
Doherty, M. E., Mynatt, C. R., Tweney, R. D., & Schiavo, M. D. (1979). Pseudodiagnosticity. Acta Psychologica, 43, 111-121.
Edwards, W. (1983). Human cognitive capabilities, representativeness, and ground rules for research. In P. C. Humphreys, O. Svenson, & A. Vari (Eds.), Analysing and aiding decision processes (pp. 507-513). Amsterdam: North-Holland.
Edwards, W., & von Winterfeldt, D. (1986). On cognitive illusions and their implications. In H. R. Arkes & K. R. Hammond (Eds.), Judgment and decision making: An interdisciplinary reader (pp. 642-679). Cambridge, England: Cambridge University Press.
Einhorn, H. J., & Hogarth, R. M. (1981). Behavioral decision theory: Processes of judgment and choice. Annual Review of Psychology, 32, 53-88.
Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1, 288-299.
Fischhoff, B. (1982). Debiasing. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 422-444). Cambridge, England: Cambridge University Press.
Fischhoff, B., & Beyth-Marom, R. (1983). Hypothesis evaluation from a Bayesian perspective. Psychological Review, 90, 239-260.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1977). Knowing with certainty: The appropriateness of extreme confidence. Journal of Experimental Psychology: Human Perception and Performance, 3, 552-564.
Funder, D. C. (1987). Errors and mistakes: Evaluating the accuracy of social judgment. Psychological Bulletin, 101, 75-90.
Gilovich, T. (1981). Seeing the past in the present: The effect of associations to familiar events on judgments and decisions. Journal of Personality and Social Psychology, 40, 797-808.
Gregory, W. L., Cialdini, R. B., & Carpenter, K. M. (1982). Self-relevant scenarios as mediators of likelihood estimates and compliance: Does imagining make it so? Journal of Personality and Social Psychology, 43, 88-99.
Grice, H. P. (1975). Logic and conversation. In D. Davidson & G. Harman (Eds.), The logic of grammar (pp. 64-75). Encino, CA: Dickinson.


Harkness, A. R., DeBono, K. G., & Borgida, E. (1985). Personal involvement and strategies for making contingency judgments: A stake in the dating game makes a difference. Journal of Personality and Social Psychology, 49, 22-32.
Helson, H. (1964). Adaptation-level theory: An experimental and systematic approach to behavior. New York: Harper.
Hoch, S. J. (1985). Counterfactual reasoning and accuracy in predicting personal events. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 719-731.
Hogarth, R. M. (1981). Beyond discrete biases: Functional and dysfunctional aspects of judgmental heuristics. Psychological Bulletin, 90, 197-217.
Hull, C. L. (1920). Quantitative aspects of the evolution of concepts. Psychological Monographs, 28 (1, Whole No. 123).
Johnson, E. J., & Payne, J. W. (1985). Effort and accuracy in choice. Management Science, 31, 395-414.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.). (1982). Judgment under uncertainty: Heuristics and biases. Cambridge, England: Cambridge University Press.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237-251.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263-291.
Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39, 341-350.
Koriat, A., Lichtenstein, S., & Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory, 6, 107-118.
Kubovy, M. (1977). Response availability and the apparent spontaneity of numerical choices. Journal of Experimental Psychology: Human Perception and Performance, 3, 359-364.
Lehman, D. R., Lempert, R. O., & Nisbett, R. E. (1988). The effects of graduate training on reasoning: Formal discipline and thinking about everyday-life events. American Psychologist, 43, 431-442.
Lichtenstein, S., Fischhoff, B., & Phillips, L. D. (1982). Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 306-354). Cambridge, England: Cambridge University Press.
Loewenstein, G. F. (1988). Frames of mind in intertemporal choice. Management Science, 34, 200-214.
Lord, C. G., Lepper, M. R., & Preston, E. (1984). Considering the opposite: A corrective strategy for social judgment. Journal of Personality and Social Psychology, 47, 1231-1243.
McNeil, B. J., Pauker, S. G., Sox, H. C., Jr., & Tversky, A. (1982). On the elicitation of preferences for alternative therapies. New England Journal of Medicine, 306, 1259-1262.
Murphy, A. H., & Winkler, R. L. (1974). Subjective probability forecasting experiments in meteorology: Some preliminary results. Bulletin of the American Meteorological Society, 55, 1206-1216.
Neely, J. H. (in press). Semantic priming effects in visual word recognition: A selective review of current findings and theories. In D. Besner & G. Humphreys (Eds.), Basic processes in reading: Visual word recognition. Hillsdale, NJ: Erlbaum.
Nisbett, R. E., Krantz, D. H., Jepson, C., & Kunda, Z. (1983). The use of statistical heuristics in everyday inductive reasoning. Psychological Review, 90, 339-363.
Northcraft, G. B., & Neale, M. A. (1986). Opportunity costs and the framing of resource allocation decisions. Organizational Behavior and Human Decision Processes, 37, 348-356.
Northcraft, G. B., & Neale, M. A. (1987). Experts, amateurs, and real estate: An anchoring-and-adjustment perspective on property pricing decisions. Organizational Behavior and Human Decision Processes, 39, 84-97.


Paquette, L., & Kida, T. (1988). The effect of decision strategy and task complexity on decision performance. Organizational Behavior and Human Decision Processes, 41, 128-142.
Payne, J. W. (1982). Contingent decision behavior. Psychological Bulletin, 92, 382-402.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1988). Adaptive strategy selection in decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 534-552.
Petty, R. E., & Cacioppo, J. T. (1984). The effects of involvement on responses to argument quantity and quality: Central and peripheral routes to persuasion. Journal of Personality and Social Psychology, 46, 69-81.
Phillips, L. D. (1983). A theoretical perspective on heuristics and biases in probabilistic thinking. In P. C. Humphreys, O. Svenson, & A. Vari (Eds.), Analysing and aiding decision processes (pp. 525-543). Amsterdam: North-Holland.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. L. Solso (Ed.), Information processing and cognition: The Loyola symposium (pp. 55-85). Hillsdale, NJ: Erlbaum.
Powell, J. L. (1988). A test of the knew-it-all-along effect in the 1984 presidential and statewide elections. Journal of Applied Social Psychology, 18, 760-773.
Ratcliff, R., & McKoon, G. (1981). Automatic and strategic priming in recognition. Journal of Verbal Learning and Verbal Behavior, 20, 204-215.
Restle, F. (1971). Instructions and the magnitude of an illusion: Cognitive factors in the frame of reference. Perception and Psychophysics, 9, 31-32.
Ross, L., Lepper, M. R., Strack, F., & Steinmetz, J. L. (1977). Social explanation and social expectation: Effects of real and hypothetical explanations of subjective likelihood. Journal of Personality and Social Psychology, 35, 817-829.
Shaklee, H., & Tucker, D. (1980). A rule analysis of judgments of covariation between events. Memory & Cognition, 8, 459-467.
Slobin, D. I. (1971). Psycholinguistics. Glenview, IL: Scott, Foresman.
Slovic, P., & Fischhoff, B. (1977). On the psychology of experimental surprises. Journal of Experimental Psychology: Human Perception and Performance, 3, 544-551.
Slovic, P., & Lichtenstein, S. (1971). Comparison of Bayesian and regression approaches to the study of information processing in judgment. Organizational Behavior and Human Performance, 6, 649-744.

Smith, E. E., Shoben, E. J., & Rips, L. J. (1974). Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review, 81, 214-241.
Staff. (1989, September 4). Don't B-2 sure. The New Republic, pp. 7-8.
Stevens, S. S. (1957). On the psychophysical law. Psychological Review, 64, 153-181.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643-662.
Sutherland, H. J., Dunn, Y., & Boyd, N. F. (1983). The measurement of values for states of health with linear analog scales. Medical Decision Making, 3, 477-487.
Tetlock, P. E., & Kim, J. I. (1987). Accountability and judgment processes in a personality prediction task. Journal of Personality and Social Psychology, 52, 700-709.
Thaler, R. (1985). Mental accounting and consumer choice. Marketing Science, 4, 199-214.
Thaler, R. (1986). The psychology and economics conference handbook: Comments on Simon, on Einhorn and Hogarth, and on Tversky and Kahneman. Journal of Business, 59, S279-S284.
Thorngate, W. (1980). Efficient decision heuristics. Behavioral Science, 25, 219-225.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207-232.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211, 453-458.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315.
Wagenaar, W., & Keren, G. B. (1986). Does the expert know? The reliability of predictions and confidence ratings of experts. In E. Hollnagel, G. Mancini, & D. Woods (Eds.), Intelligent decision support in process environments (pp. 87-103). Berlin: Springer-Verlag.
Yates, J. F., & Carlson, B. W. (1986). Conjunction errors: Evidence for multiple judgment procedures, including "signed summation." Organizational Behavior and Human Decision Processes, 37, 230-253.

Received December 6, 1989
Revision received July 10, 1990
Accepted February 12, 1991
