Journal of Occupational Psychology, 1986, 59, 81-92.

Printed in Great Britain 1986 The British Psychological Society

Heroism is no substitute for success: Effects of strategy and outcome on perceptions of performance
PATRICK A. KNIGHT* AND FRANK E. SAAL Kansas State University
Two studies were conducted to determine whether performance ratings are more heavily influenced by raters' perceptions of the ratee's 'heroism' (that is, consistent behaviour in the face of failure) or by raters' perceptions of the ratee's level of success. Effects of ratee gender were also investigated. Neither a student sample (n = 179) nor a management sample (n = 127) provided evidence in support of a heroism effect; the effects of strategy (consistent or experimenting) and timing of success or failure (immediate or delayed) were relatively small. Success and failure had the strongest and most consistent effects on performance ratings. These results supported the proposition that raters' perceptions of a ratee's performance (level of success) are the primary determinants of ratings, and that 'heroic' perseverance in the face of failure does not result in higher ratings.

Over the years Staw (1981) and his associates have conducted considerable research on the 'escalation of commitment' phenomenon. Escalation of commitment refers to a situation in which a manager or administrator institutes a failing policy, and in response to the failure commits greater and greater time and resources to the policy. Most of Staw's research has focused on 'justification' processes, in which the escalation of resources committed to the failing policy is viewed as an effort to justify the original decision by 'making' it work (e.g. Staw, 1980). In an effort to identify other potential contributors to escalation of commitment, Staw & Ross (1980) suggested the notion of a 'heroic leadership' stereotype, wherein those who suffer through failure, only to succeed in the end, are perceived as being exceptional managers. Based on this idea, Staw & Ross predicted that managers who eventually succeed by pursuing a consistent strategy in the face of failure will be perceived as more effective than those who experiment or adopt new policies when their original choices fail.

Staw & Ross' (1980) definition of heroism has three components: First, the hero must experience failure. Second, in response to this failure, the hero must be consistent, maintaining the behaviour that is apparently responsible for the failure rather than experimenting or otherwise changing the behaviour. Third, the hero must ultimately succeed. If this stereotype actually exists, we might expect managers to 'heroically' maintain ineffective policies, and eschew potentially more effective alternatives, in order to conform to this image of competence. Of course, such behaviour would be contrary to Campbell's (e.g. 1969, 1977) suggestion that managers should adopt an experimenting approach to policy administration. In an experimenting approach, policies are periodically evaluated and, if
*Requests for reprints should be addressed to Patrick A. Knight, Department of Psychology, Bluemont Hall, Kansas State University, Manhattan, KS 66506, USA.


found to be ineffective, replaced with new policies that are subsequently evaluated in the same manner.

Staw & Ross (1980) reported data that they claimed supported the existence of a heroism stereotype. They had subjects read scenarios that described an administrator whose initial policy was ineffective, and who was either consistent, sticking with the policy throughout the scenario, or who experimented, changing policies twice. In each strategy condition, half of the subjects read that the ratee was successful in the end, while the other half were told that the ratee remained unsuccessful. Staw & Ross found: (a) that the successful ratee was rated higher than the unsuccessful ratee; (b) that the consistent ratee was rated higher than the experimenting ratee; and (c) that a significant outcome x strategy interaction emerged: the consistent ratee who was ultimately successful was rated higher than all others. This interaction, Staw & Ross argued, supported their view of heroism as a consistent, and eventually successful, response to initial failure.

Although Staw & Ross' (1980) findings do correspond to their heroic leadership stereotype, Knight (1984) suggested a more parsimonious explanation for these results. He noted that the main effect favouring the consistent ratee was due largely to the high rating this ratee received in the successful outcome condition (there was little difference between ratings of consistent and experimenting managers in the failure condition) and that this was the only cell in the study in which the ratee's initially chosen policy achieved success.
Knight therefore proposed that the consistent-successful ratee received high ratings not because he was perceived by the subjects as being heroic, but rather because that combination of strategy and success led subjects to perceive him as being more competent than administrators in other conditions, all of whom either failed or chose an effective policy only after selecting two ineffective policies.

Knight (1984) concluded that the manipulation of managerial strategy (experimenting versus consistency) is likely to confound any heroism effects with perceptions of overall competence and performance. He therefore argued that a more valid test of Staw & Ross' (1980) heroism stereotype would involve the one component of the stereotype that Staw & Ross had not manipulated: the experience of failure. Knight predicted that if the heroism stereotype existed, then consistent ratees whose success was delayed (i.e. a consistent response to initial failure, or 'heroism') would receive higher performance ratings than equally consistent ratees whose success was immediate, and who therefore could neither have experienced failure nor have demonstrated 'heroic' commitment to their policies or decisions. In the absence of any difference in performance ratings between these two consistent-successful conditions, the impact of perceived heroism, as defined by Staw & Ross, must be questioned.

Knight (1984) replicated Staw & Ross' (1980) success condition, comparing ratings of managers who demonstrated either a consistent or an experimenting response to initial failure, but who ultimately succeeded. In addition to these 'delayed success' cells, Knight included conditions in which the ratee described in the scenarios was immediately successful and was either consistent, using his initial policy throughout the scenario, or experimenting, changing policies twice while maintaining his successful performance.
Knight argued that by comparing the consistent ratees in the immediate and delayed success conditions, the potential benefits of suffering initial failure (heroism) could be tested.

Knight's (1984) results revealed no support for the heroism stereotype. Ratings of the consistent-delayed success ratee were not significantly different from those of the consistent-immediate success ratee. Both consistent ratees, however, were rated higher than the experimenting-delayed success ratee. Knight viewed this as corroborating his 'perceived performance' interpretation of Staw & Ross' (1980) data, since the first policies chosen by the consistent ratees proved to be effective, while the experimenting-delayed success ratee abandoned two failing policies before finally succeeding. In addition, ratees in the


experimenting-immediate success condition received ratings even higher than those in the consistent conditions. Knight interpreted this latter effect as additional evidence in support of the perceived performance explanation, since this ratee, unlike those in other conditions, successfully implemented all three policies.

LIMITATIONS OF PREVIOUS RESEARCH

While Knight's (1984) data suggest that Staw & Ross' (1980) results may have been due to differences in raters' perceptions of the administrator's performance rather than their perceptions of his heroism, this evidence is not conclusive. One problem with Knight's study is that the data are based solely on a sample of psychology undergraduates; Staw & Ross (1980) did find differences between the responses obtained from psychology students and responses obtained from business students (some of whom were managers attending evening MBA classes). Knight's failure to support the heroism stereotype may have been a function of the population from which his sample was drawn; this stereotype may indeed exist in the business world, and among those students specifically preparing for business careers. One purpose of the current research, therefore, was to compare student and management samples in a design that includes the temporal factor necessary to test the heroism hypothesis adequately.

A second problem with Knight's design is that the manager ultimately succeeded in all of his scenarios. The most obvious, and crucial, evidence for the perceived performance explanation is a main effect for outcome, favouring the successful ratee over the ratee who ultimately fails. Although other effects may help identify variables that moderate the influence of success and failure, without a main effect for outcome it would be difficult to conclude that ratings were based on perceptions of performance.

In addition to a main effect due to success or failure, there are other effects that can provide support for the 'perceived performance' explanation of ratings. For example, ultimately successful ratees should receive higher ratings if their success is immediate rather than delayed, as found by Knight (1984), since immediate success means that they were successful for a longer period of time than those ratees who experienced delayed success.
This 'proportion of time successful' criterion also suggests that ultimately unsuccessful ratees should receive higher ratings if their failure is delayed rather than immediate, since delayed failure implies that the standard for success was met for at least some period of time, while immediate failure demonstrates that the ratee was totally ineffective.

It should be noted that the outcome x strategy interaction provides a test of Campbell's (1977) prescription for experimentation in response to policy failure. Neither Staw & Ross' (1980) nor Knight's (1984) subjects rewarded experimenting managers with high ratings. However, if subjects were to view this strategy as effective, ratees who ultimately fail should receive higher ratings if they experiment rather than demonstrate consistency, since by experimenting they are apparently attempting to identify more effective policies, as Campbell advised.

Yet another limitation to both Staw & Ross' (1980) and Knight's (1984) studies is that their scenarios described only male managers. Given the popular stereotype of management as a masculine occupation (Schein, 1973, 1975), and considering the number of studies showing that, given equal achievement, men's performance tends to be attributed to ability while the performance of women tends to be attributed to luck (e.g. Deaux & Emswiller, 1974), one might expect sex to moderate the effects of policy strategy on subsequent performance ratings. Since subjective ratings play such an important role in personnel decisions at the managerial level, comparisons between male and female ratees would be useful.

STUDY 1

Method

Sample. One hundred and seventy-nine undergraduate psychology students served as subjects in the first study, in partial fulfillment of a course requirement. The sample was 46 per cent female.

Procedure. The research was described to the subjects as an investigation of managerial decision making. Each subject read one of 16 randomly assigned versions of a case study describing an industrial manager and a production problem that the manager had to solve. The scenarios described either a male or a female manager engaging in either a consistent or an experimenting approach to solving the problem. Through periodic feedback reports, it was shown that the manager had been either successful or unsuccessful by the end of the scenario, and that this outcome had either occurred immediately (i.e. by the first feedback report) or was delayed (i.e. not obtained until the end of the scenario). The resulting factorial design was a 2 (sex of ratee) x 2 (strategy) x 2 (outcome) x 2 (timing of outcome). After reading the scenario, subjects completed a questionnaire that included measures of their perceptions of the ratee's performance, their perceptions of the ratee's leadership characteristics, and several manipulation checks.

Stimulus materials. The scenarios used in this study were, with the addition of two independent variable manipulations, identical to those used by Knight (1984). The scenarios began with a statement stressing the importance of management and management strategy given the uncertain nature of the economy. The statement specifically emphasized the role of middle- and lower-level managers in determining the economic health of businesses. The subjects then read about a middle-level manager named Ziegler who worked for a company that manufactured cast metal products (General Casting, Inc.). The manager was 41 years old, held BA and MBA degrees, and had been working for the company for 11 years, the last five in his or her current position.
The manager supervised six first-line managers. The scenarios next described a production problem in Ziegler's unit. The percentage of defective castings produced by the unit had increased from 5.3 to 8.7 per cent over the previous year (January 1977 to January 1978). A maximum acceptable defect rate of 6 per cent had been established earlier. In order to help identify the cause of the high defect rate, the company had hired a consultant to examine the production process. The consultant outlined three possible solutions: (a) Increase the number of quality control inspectors; (b) Redesign the castings to prevent possible structural causes of the defects; and (c) Use stronger, more expensive metals in the casting process. The consultant said that any of the three solutions might be successful, but could not specify an exact percentage improvement to be expected from any of the options. It was estimated that the plans would be equally costly to implement.

The scenario went on to state that in June 1978, Ziegler and his or her first-line managers discussed the consultant's suggestions and ranked the plans in order of estimated effectiveness. Hiring additional inspectors was ranked first, redesign of the castings was ranked second, and use of new materials was ranked third. Due to economic constraints, it was decided that only one plan could be implemented at a time. It was noted, however, that Ziegler was responsible for deciding whether the company would stick with one plan for an extended period of time, or change plans after shorter periods. Ziegler's superiors had given him or her one year from the implementation of the first policy to reach the 6 per cent standard, so he or she decided to schedule an evaluation of the selected policy every four months to allow for possible changes. Ziegler began by instituting the plan to hire additional inspectors. The inspectors began work on 1 September 1978, with a defect rate of 8.5 per cent.
Evaluations were thus scheduled for 1 January, 1 May and 1 September 1979.
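
The 2 x 2 x 2 x 2 design described in the Procedure can be enumerated mechanically. The following Python sketch is an illustration only and is not part of the original study materials; the factor labels and the names `FACTORS`, `enumerate_cells` and `heroism_test_cells` are hypothetical. It lists the 16 scenario versions and picks out the consistent-success cells, whose immediate and delayed variants form the comparison used to test the heroism hypothesis.

```python
from itertools import product

# Factor levels as described in the Procedure. The keys are illustrative
# labels chosen for this sketch, not names used by the authors.
FACTORS = {
    "ratee_sex": ("female", "male"),
    "strategy": ("consistent", "experimenting"),
    "outcome": ("success", "failure"),
    "timing": ("immediate", "delayed"),
}


def enumerate_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


cells = enumerate_cells(FACTORS)

# Cells relevant to the heroism test: consistent strategy, successful outcome.
# Heroism predicts higher ratings for delayed than for immediate success here.
heroism_test_cells = [
    cell for cell in cells
    if cell["strategy"] == "consistent" and cell["outcome"] == "success"
]

print(len(cells))  # number of scenario versions in the full design
for cell in heroism_test_cells:
    print(cell)
```

Collapsing over ratee sex, the four selected cells reduce to the two conditions compared in the heroism test: consistent-immediate success and consistent-delayed success.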


Manipulations. The sex of the manager/ratee was manipulated by varying Ziegler's first name (Joan versus John), and by using the appropriate sex-specific pronouns in the scenarios and the questionnaire. The outcomes and the timing of the outcomes were manipulated through descriptions of the periodic evaluations. In the success condition, the defect rate fell to below the 6 per cent goal by the end of the year, while in the failure condition the defect rate was near the original 8.5 per cent at the end of the year. In the immediate outcome condition, the final outcome was reported in the first feedback report and remained the same thereafter, while in the delayed outcome condition, the final outcome did not occur until the third feedback report, and was preceded by the opposite outcome in the earlier reports. Specifically, in the immediate success condition, the defect rate was 6.0 per cent in January, 5.8 per cent in May, and 5.5 per cent in September. In the immediate failure condition the rates were 8.4, 8.2 and 8.3 per cent. In the delayed success condition the rates in January and May were 8.4 and 8.2 per cent, respectively, but the rate fell to 5.5 per cent by September. Finally, in the delayed failure condition the rates were 6.0, 5.8 and 8.3 per cent.

The ratee's strategy was manipulated in the following manner: In the consistent strategy condition, Ziegler decided to continue the use of the additional inspectors after each of the first two evaluations. In the experimenting condition, Ziegler decided after each of the first two reports that the current plan had been given a fair chance, and changed to a new policy to test its effectiveness. Ziegler changed to the new casting designs in January and to the new metals in May. The decision to change or not to change policies was made by Ziegler alone, and not in consultation with his or her subordinates. Further, each decision was presented as being independent of the previous decisions.
Since no policy concerning strategy was ever explicitly stated in the scenario, the overall strategy could only be inferred by the subjects from the history of Ziegler's decisions.

Dependent measures. The dependent measures used in this study were basically the same as those used by Knight (1984), which were in turn based on those described by Staw & Ross (1980). After reading the scenario, subjects completed a questionnaire from which two dependent variables were derived. The first was a performance rating equal to the sum of the following seven-point Likert items: 'On the whole, how would you rate Joan [John] Ziegler's performance?' (from 1 = very poor, to 7 = outstanding); 'Do you think that Joan [John] Ziegler deserves a large raise in salary?' (from 1 = definitely should not be given, to 7 = definitely should be given); and 'Do you think Joan [John] Ziegler should be considered for a higher position at General Casting?' (from 1 = definitely should not be promoted, to 7 = definitely should be promoted). The mean of this scale was 12.60 and the standard deviation was 4.33. The mean intercorrelation of these items was 0.75 and the reliability of the scale was adequate (Cronbach's α = 0.90).

The second dependent variable was a leadership characteristics rating formed by summing the following seven-point Likert items (from 1 = strongly disagree to 7 = strongly agree): 'Joan [John] Ziegler is a very intelligent individual'; 'Joan [John] Ziegler is a careful planner'; 'Joan [John] Ziegler would make a strong leader'; 'Joan [John] Ziegler had a well developed theory of the casting production problem'; and 'Joan [John] Ziegler is exactly the kind of decision maker we need in industry'. The mean of this scale was 18.14 and the standard deviation was 6.50. The mean intercorrelation of these items was 0.56, and the reliability of the scale was again adequate (α = 0.86).

Results

Manipulation checks.
Ziegler was rated as being more consistent in the consistent condition than in the experimenting condition (F = 102.61, d.f. = 1, 159, P < 0.01, η² = 0.38) and as having more closely met the 6 per cent standard in the success condition than in the failure condition (F = 1226.38, d.f. = 1, 160, P < 0.01, η² = 0.86). The timing manipulation


Table 1. Outcome x strategy interactions: Study 1

                                      Outcome
Strategy                  Success     Failure     Total

Mean performance ratings
  Consistent              15.20a       7.96c      10.18
  Experimenting           15.00a      12.44b      13.68
  Total                   15.10       10.18       12.60

Mean leadership characteristics ratings
  Consistent              24.78a      15.45c      20.22
  Experimenting           23.33a      20.93b      22.12
  Total                   24.09       18.16       21.14

a,b,c Means with common indices are not significantly different. Higher ratings are more favourable.

was checked by ratings of how quickly the production problem was solved. The outcome x timing interaction on this rating was significant (F = 209.74, d.f. = 1, 159, P < 0.01, η² = 0.51) with the expected pattern of means: the highest rating was in the immediate success condition, followed by later failure, later success, and immediate failure conditions.* While it could be argued that immediate success and later failure showed equally quick (i.e. immediate) results, it was expected, and found, that the inability of the latter ratee to maintain success would lower those ratings.

Analyses of dependent variables. Because the two dependent variables were correlated (r = 0.78), a 2 x 2 x 2 x 2 multivariate analysis of variance was conducted. Significant multivariate effects were obtained for the outcome main effect (F = 53.46, d.f. = 2, 162, P < 0.01), the strategy main effect (F = 9.60, d.f. = 2, 162, P < 0.01), the outcome x timing interaction (F = 5.60, d.f. = 2, 162, P < 0.01) and the outcome x strategy interaction (F = 12.85, d.f. = 2, 162, P < 0.01). A 2 x 2 x 2 x 2 univariate analysis of variance was conducted on each of the dependent variables to further investigate the effects detected by the MANOVA.

In the analysis of the performance ratings there were two significant main effects. Subjects who evaluated the success conditions gave higher ratings than those who evaluated the failure conditions (F = 109.34, d.f. = 1, 163, P < 0.01, η² = 0.33); and subjects who rated experimenting ratees gave higher ratings than those who evaluated consistent ratees (F = 20.76, d.f. = 1, 163, P < 0.01, η² = 0.06; see Table 1). These main effects were moderated by a significant outcome x strategy condition interaction (F = 24.27, d.f. = 1, 163, P < 0.01, η² = 0.07). Post hoc (Newman-Keuls) tests (Ps < 0.05) showed that subjects in the success condition gave the highest ratings, and that strategy made no difference, contrary to the heroism notion.
Further, while ratings in the failure condition were all significantly lower than those in the success condition, the post hoc tests revealed that significantly higher ratings were obtained in the experimenting-failure condition than in the consistent-failure condition. This is also inconsistent with Staw & Ross' (1980) results. It is consistent, however, with Campbell's (1977) suggestion that experimenting is the appropriate response to policy failure (see Table 1).

In addition to the effects described above, there was a significant interaction between outcome and timing conditions (F = 9.54, d.f. = 1, 163, P < 0.01, η² = 0.03; see Table 2).
*The outcome x timing interaction, rather than the main effect for timing, is the appropriate check of this manipulation, since the immediate and delayed outcome conditions were orthogonally crossed with the success and failure conditions. Main effects for timing on measures of the timing of either success or failure would therefore reveal no significant differences, even if the manipulations were successful.


Table 2. Outcome x timing interaction on performance ratings: Study 1

                       Timing of outcome
Outcome          Immediate     Delayed     Total
  Success         15.60a       14.58a      15.10
  Failure          9.12        11.12b      10.18
  Total           12.43        12.76       12.60

a,b Means with common indices are not significantly different. Higher ratings are more favourable.
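
The 'proportion of time successful' reasoning can be made concrete with the defect rates given in the Manipulations paragraph. The Python sketch below is an illustration only, not an analysis performed in the study. It assumes that a defect rate at or below the 6 per cent standard counts as success in that period, and it ranks cells by final outcome first and by proportion of successful periods second; that ordering rule is a modelling choice made here, and the names `DEFECT_RATES`, `proportion_successful` and `predicted_order` are hypothetical.

```python
STANDARD = 6.0  # maximum acceptable defect rate (per cent)

# Defect rates (per cent) at the January, May and September 1979 evaluations.
DEFECT_RATES = {
    ("success", "immediate"): (6.0, 5.8, 5.5),
    ("success", "delayed"): (8.4, 8.2, 5.5),
    ("failure", "immediate"): (8.4, 8.2, 8.3),
    ("failure", "delayed"): (6.0, 5.8, 8.3),
}

# Mean performance ratings reported in Table 2 (Study 1).
TABLE_2_MEANS = {
    ("success", "immediate"): 15.60,
    ("success", "delayed"): 14.58,
    ("failure", "immediate"): 9.12,
    ("failure", "delayed"): 11.12,
}


def proportion_successful(rates, standard=STANDARD):
    """Share of evaluation periods in which the standard was met."""
    return sum(rate <= standard for rate in rates) / len(rates)


def predicted_order(rates_by_cell, standard=STANDARD):
    """Rank cells by final outcome, then by proportion of time successful."""
    def key(cell):
        rates = rates_by_cell[cell]
        final_success = rates[-1] <= standard
        return (final_success, proportion_successful(rates, standard))
    return sorted(rates_by_cell, key=key, reverse=True)


# Rank order of the observed cell means, highest first.
observed_order = sorted(TABLE_2_MEANS, key=TABLE_2_MEANS.get, reverse=True)

for cell in predicted_order(DEFECT_RATES):
    share = proportion_successful(DEFECT_RATES[cell])
    print(cell, f"{share:.2f}", TABLE_2_MEANS[cell])
```

Under these assumptions the predicted ranking matches the rank order of the four means in Table 2. This is a statement about rank order only: the two success means were not significantly different from each other.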

Post hoc (Newman-Keuls) tests showed subjects in the immediate success and delayed success conditions gave significantly higher ratings than those in other conditions, while those in the failure conditions gave higher ratings to the ratee whose failure was delayed than to the ratee whose failure was immediate (Ps < 0.05). This pattern of means is consistent with the perceived performance explanation, and contrary to predictions based on heroism. No other significant effects emerged from analyses of performance ratings.

The analysis of the leadership characteristics ratings yielded three significant main effects. Subjects in the success condition gave higher ratings than subjects in the failure condition (F = 55.78, d.f. = 1, 159, P < 0.01, η² = 0.22); subjects in the experimenting condition gave higher ratings than subjects in the consistent condition (F = 5.90, d.f. = 1, 159, P < 0.05, η² = 0.02; see Table 1); and subjects in the female ratee condition (mean = 22.01) gave higher ratings than did those in the male ratee condition (mean = 20.24; F = 5.96, d.f. = 1, 159, P < 0.05, η² = 0.02). The main effect for sex of ratee should be interpreted with caution, since the corresponding multivariate effect was not significant. There was also a significant outcome x strategy condition interaction (F = 18.48, d.f. = 1, 159, P < 0.01, η² = 0.07). Post hoc (Newman-Keuls) tests revealed the same pattern of significant differences as we reported for the performance ratings: Subjects in both success conditions gave higher ratings than in either failure condition, while subjects in the experimenting-failure condition gave higher ratings than did subjects in the consistent-failure condition (Ps < 0.05; see Table 1).

All analyses were conducted a second time with the sex of the student subject/rater included as a factor. There were no significant effects involving subject sex, and the significance of the other effects was unchanged.
Discussion

The results of the first study support the perceived performance explanation of ratings described above. Relatively large main effects for outcome (success vs. failure) on both dependent variables and the outcome x timing interaction on the performance ratings were obtained. Further, the outcome x strategy interaction, predicted on the basis of Campbell's (1977) advice to experiment in response to policy failure, was found for both variables. Contrary to the notion of a stereotype of 'heroism' proposed by Staw & Ross (1980; Staw, 1981), there was no significant difference between the consistent-delayed success condition and the consistent-immediate success condition on either variable. As noted earlier, however, this result suggests only that there is no general stereotype of 'managerial heroism' among psychology students; it does not rule out such a stereotype among practicing managers. The second study was conducted to test the generalizability of these results to a management population.

STUDY 2

Method

Subjects. One hundred and twenty-seven managers served as subjects in the second study. Four of the managers were employees of a large Texas city; the rest were employed by a wide variety of organizations in the state of Kansas. This group included employees of manufacturing firms, insurance companies, banks, a state university, and civilian employees of the US Army. Participation of the managers was solicited either by contacting upper-level personnel managers and asking for their cooperation in distributing the materials to managers and supervisors in their organizations, or during management development seminars conducted by the second author. To be included in the management sample a person had to supervise at least one other person's work. One hundred and ninety-one managers were asked to participate, and the final sample represents a response rate of 67 per cent. Ninety-two (72 per cent) of the managers reported their sex, and of these, 34 (37 per cent) were women. Of the 35 managers not reporting their sex, 20 were from a single manufacturing company.

Procedure. The scenarios and the questionnaire were distributed to the managers in sealed packages with a cover letter that included an informed consent statement and a guarantee of anonymity. Data from the managers who completed the exercise after the management development seminars were collected immediately following the seminar sessions. The other responses were gathered by the personnel administrators in the various organizations, and were either mailed to the researchers or picked up by the researchers in person.

Stimulus materials and measures. The scenarios, manipulations, and questionnaire were identical to those used in the first study. In this sample the performance rating scale had a mean of 9.44 and a standard deviation of 3.92. The mean intercorrelation of the three items was 0.74 and the reliability was adequate (α = 0.89).
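
The reported reliabilities can be cross-checked against the reported mean inter-item correlations. The Python sketch below uses the standardized form of Cronbach's alpha, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the mean inter-item correlation. This is an assumption made for illustration: the article does not state which formula was used, and the names `standardized_alpha` and `SCALES` are hypothetical. The input values are those reported for the two Study 1 scales and for the Study 2 performance scale.

```python
def standardized_alpha(n_items, mean_r):
    """Standardized Cronbach's alpha from the mean inter-item correlation."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# (scale, number of items, mean inter-item correlation, reported alpha)
SCALES = [
    ("Study 1 performance", 3, 0.75, 0.90),
    ("Study 1 leadership", 5, 0.56, 0.86),
    ("Study 2 performance", 3, 0.74, 0.89),
]

for name, n_items, mean_r, reported in SCALES:
    computed = standardized_alpha(n_items, mean_r)
    print(f"{name}: computed {computed:.3f}, reported {reported:.2f}")
```

Under this assumption the computed values fall within 0.01 of the reported coefficients. Small differences are to be expected, because an alpha computed from raw item variances need not equal the standardized value.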
The leadership characteristics scale had a mean of 16.25 and a standard deviation of 7.11. The five items in this scale had a mean intercorrelation of 0.62, and the reliability was again adequate (α = 0.89).

Results

Manipulation checks. Subjects in the consistent condition perceived greater consistency than did those in the experimenting condition (F = 23.67, d.f. = 1, 111, P < 0.01, η² = 0.16); subjects in the success condition agreed more strongly than did subjects in the failure condition that Ziegler had met the 6 per cent standard (F = 209.79, d.f. = 1, 111, P < 0.01, η² = 0.61); and the predicted interaction between outcome and timing conditions on the timing manipulation check was also obtained (F = 79.84, d.f. = 1, 111, P < 0.01, η² = [illegible]).

Analyses of dependent variables. As in the first study, the dependent variables were highly correlated (r = 0.82), and a multivariate analysis of variance was performed. Significant multivariate effects were obtained for the outcome main effect (F = 12.79, d.f. = 2, 110, P < 0.01), the outcome x strategy interaction (F = 3.73, d.f. = 2, 110, P < 0.05), and the outcome x timing interaction (F = 3.19, d.f. = 2, 110, P < 0.05). A 2 x 2 x 2 x 2 univariate analysis of variance on the performance rating scale data revealed two significant effects. First, there was a main effect for the outcome condition, with subjects who evaluated success conditions giving higher ratings than subjects who evaluated failure conditions (F = 25.83, d.f. = 1, 111, P < 0.01, η² = [illegible]; see Table 3). Second, the outcome x strategy interaction was significant (F = 7.18, d.f. = 1, 111, P < 0.01, η² = 0.04). Post hoc tests


Table 3. Outcome x strategy interactions: Study 2

                                      Outcome
Strategy                  Success     Failure     Total

Mean performance ratings
  Consistent              12.71a       7.56c       9.62
  Experimenting           10.00b       8.45bc      9.28
  Total                   11.08        7.97        9.44

Mean leadership characteristics ratings
  Consistent              21.92a      13.89b      17.10
  Experimenting           16.69b      14.10b      15.49
  Total                   18.78       13.99       16.25

a,b,c Means with common indices are not significantly different. Higher ratings are more favourable.

(Newman-Keuls) showed that the consistent-successful ratee received higher ratings than ratees in all the other conditions, while the experimenting-successful ratee received higher ratings than the consistent-failing ratee (Ps < 0.05; see Table 3).

The analysis of variance on the leadership characteristics ratings showed that subjects in the success condition gave higher ratings than those in the failure condition (F = 18.42, d.f. = 1, 111, P < 0.01, η² = 0.12; see Table 3). The outcome x strategy interaction was also significant (F = 4.88, d.f. = 1, 111, P < 0.05, η² = 0.03). Post hoc tests (Newman-Keuls) revealed that the consistent-successful ratee received higher ratings than ratees in any other condition (P < 0.05; see Table 3).* The univariate outcome x timing effect was not significant for either dependent variable.
Discussion

The second study provides further insights into the effects of outcomes, strategy, and timing of outcomes on performance and leadership ratings. With respect to the heroism stereotype, the managers' data revealed a preference for the consistent ratee in the success condition, as predicted by Staw & Ross (1980). As argued by Knight (1984), however, Staw & Ross' description of heroism implies that the crucial test of the stereotype is in the consistent-successful condition, where the ratee whose success is delayed should receive the highest ratings. There was, however, no significant difference between the consistent-delayed success condition and the consistent-immediate success condition. In sum, there is no evidence for a stereotype of heroism, as defined by Staw & Ross, in these data.

This study does yield additional support for the perceived performance explanation. The main effects for outcome favouring the successful ratee were significant, as predicted. Also, in the absence of a significant difference between the consistent-delayed success and consistent-immediate success conditions, the outcome x strategy interaction, with significantly higher ratings for the consistent-successful ratee than for the experimenting-successful ratee, is consistent with Knight's (1984) results supporting the perceived performance explanation.
Because of the large proportion (33 per cent) of subjects in the second study who did not report their sex, and the large difference between the number of known male and female subjects, it was not possible to include subject sex as a factor in these analyses. Further, the number of subjects of each sex in the cells needed to test the heroism hypothesis (consistent-delayed success and consistent-immediate success) was too small to conduct meaningful separate analyses for each.

In contrast to Study 1, the results from the management sample provided no evidence in support of Campbell's (1977) advice to experiment in the face of failure: in none of the analyses was an experimenting condition rated higher than the corresponding consistent condition.

CONCLUSIONS

Before discussing the implications of these studies, a potential limitation of this research should be acknowledged. Because the subjects in these studies rated imaginary ratees and based their ratings on necessarily limited information, the generalizability of the results to 'real world' problems may be questioned. Although generalization of results, whether from laboratory or field research, should always proceed with caution, we believe that this sort of 'paper people' task is not without its counterpart in organizations. For example, as faculty members we make decisions about accepting students into our graduate programme. Those students, at the time the decisions are made, are literally 'paper people'. When our department recruits faculty, the vast majority of the applicants are rejected on the basis of how they look on paper. Similar personnel decisions are made, we believe, by most employing organizations.
Ilgen & Favero (1985) recently argued that paper people research may have some value in studying the evaluation of resumes, but that in the study of performance appraisal the technique lacks necessary realism. Applied to the research presented here, Ilgen & Favero's arguments imply that our effects were in part determined by the fact that only a single aspect of performance (product quality of the unit) was manipulated, relative to specific goals. The salience of this manipulation in the absence of other performance cues may have strengthened the observed outcome effects, although the implications for effects involving timing and strategy are less clear. Although Ilgen & Favero (1985) raise legitimate questions about this type of research, we believe that their distinction between evaluating applications and evaluating job performance is, to a large extent, artificial. It is true that there is usually more personal interaction between rater and ratee, and more first-hand knowledge about a ratee's actual behaviour in a performance appraisal task.
The evaluation of a 'paper' person, however, whether a job applicant or a worker with whom contact is indirect or limited (e.g. sales representatives, whose evaluations may be influenced primarily by sales volume figures), involves many of the same perceptual and cognitive processes as 'normal' performance appraisal (cf. Landy & Farr, 1980). Paper people research may therefore have wider relevance than Ilgen & Favero have argued.
Turning to the current hypotheses, the results of these studies offer no support for the heroism stereotype as defined by Staw & Ross (1980). Although there were some effects that are consistent with the heroism notion, these results, like those Staw & Ross originally offered in support of the stereotype, are more easily explained by alternative interpretations of the effects. Greater support was obtained for the position that ratings are determined primarily by raters' perceptions of ratee performance. In both studies, outcome (success versus failure) had the largest effects, accounting for between 12 and 33 per cent of the variance in the ratings. This is encouraging, since performance ratings are typically used in lieu of 'harder' criteria, and should thus correspond closely to objective performance. In most analyses outcome interacted with strategy or timing. With the exception of the outcome × strategy interactions in the first study, which supported Campbell's (1977) experimenting approach, the interactions that yielded significant post hoc differences between cell means were consistent with the predicted effects of strategy and timing on perceived performance. In general, however, strategy and timing interactions had little
practical effect, accounting for between 2 and 7 per cent of the variance in the ratings. Similar effect sizes were found by Staw & Ross (1980).* It therefore seems that although factors such as policy consistency and the timing of success or failure can influence perceptions of performance and subsequent ratings, these effects are likely to be small and of limited practical importance.
It would have been desirable to compare the results of the two studies by combining the two data sets into a single analysis. When this is done, however, a 2.4:1 ratio in cell sizes and a 15:1 ratio in cell variances result. (The unequal cell sizes violate the assumptions of the standard homogeneity of variance tests.) This led us to the conclusion that a combined analysis was not defensible. Where the two studies differ in results, however, the pattern of means clearly indicates the nature of the differences.
An important comparison between the two studies concerns the relative support for Campbell's (1977) prescription to experiment. The students showed a preference for the experimenting ratee in the failure condition, which is consistent with Campbell's recommendations. They also had an overall main effect favouring experimenting. Experimentation, however, was not accepted by the managers. The students also gave higher ratings in the female ratee conditions, whereas no main effects for this variable were found in the managers' data. While it is tempting to speculate about the reasons for these differences, the size of these effects, as mentioned above, limits their practical importance. Another notable difference between the student and management samples is that the managers gave uniformly lower ratings on both scales than did the students. Also, the proportion of variance accounted for in both manipulation checks and analyses of the dependent measures was lower for the managers than for the students.
We suspect that this is due to the effects of the managers' practical experience, and their willingness to anticipate the effects of variables not described in the scenario. For example, both the students and managers apparently based their ratings of the successful ratee primarily on the success of the unit. The managers, however, may have readily recognized that factors beyond the control of the ratee could affect performance, and therefore assigned less credit to his or her actions. Clearly, future experimental tests of the effects of success and strategy on ratings would benefit from more complex and realistic stimulus materials, such as videotapes of managers interacting with employees (Ilgen & Favero, 1985).
In conclusion, these studies found no support for a heroism stereotype, as defined by Staw & Ross (1980). This does not necessarily mean that there are no 'heroes' in organizations, but it does suggest that consistency plays a small role in their emergence. As far as we know, any data supporting such a stereotype have confounded consistency with evidence of task success. Perhaps it would be more fruitful to examine systematically the effects of various task characteristics on the emergence of heroism. For example, Staw & Ross' (1980) scenarios described a public housing administrator who received feedback on his efforts to raise consumer satisfaction with housing over a number of years. It may be that perceptions of the efficacy of consistency or experimenting are moderated by the type of problem a manager faces, and the resulting time frame within which he or she must solve the problem. Perhaps 'heroism' results only from years of dedicated consistency, while 'wishy-washy' leadership is inferred only when long-standing policies are abandoned. If so, this would imply that managerial heroism is a phenomenon more likely to emerge in very long-cycle, non-industrial organizations. In any case, future research on heroism should go beyond simple examinations of managerial consistency, and consider such situational factors.
* Although Staw & Ross (1980) did not report effect sizes, it is possible to compute partial eta-squares (η²-part) from the data they did report. For Staw & Ross' success/failure main effect, η²-part = 0.23; for their strategy main effect, η²-part = 0.06; and for their strategy × outcome interaction, η²-part = 0.02. It should be noted that η²-part is an overestimate of the normal η² (Maxwell et al., 1981).
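The footnote does not say how the partial eta-squares were computed. One common route, assumed here, is the standard identity that recovers partial η² from an F ratio and its degrees of freedom: η²-part = (F × d.f.effect) / (F × d.f.effect + d.f.error). The sketch below applies that identity to the two F ratios reported for the leadership ratings in Study 2 (F = 18.42 and F = 4.88, each with d.f. = 1, 111), since Staw & Ross' own F values are not given in this text. The function name `partial_eta_squared` is illustrative only.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta-squared from an F ratio and its degrees of freedom.

    Uses the identity SS_effect / (SS_effect + SS_error)
    = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both d.f. must be positive")
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # F ratios reported in the text for the Study 2 leadership ratings
    # (d.f. = 1, 111 in both cases); the labels are descriptive only.
    reported = [("outcome main effect", 18.42), ("outcome x strategy", 4.88)]
    for label, f_value in reported:
        estimate = partial_eta_squared(f_value, df_effect=1, df_error=111)
        print(f"{label}: partial eta-squared = {estimate:.3f}")
```

Under this assumption the two partial values come out at roughly 0.14 and 0.04, somewhat above the η² values of 0.12 and 0.03 reported for the same effects, which is in line with the footnote's remark that η²-part overestimates the normal η².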

ACKNOWLEDGEMENTS

The authors wish to thank two anonymous reviewers for their helpful comments.
REFERENCES

Campbell, D. T. (1969). Reforms as experiments. American Psychologist, 24, 409-429.
Campbell, D. T. (1977). Keeping the data honest in the experimenting society. In H. W. Melton & D. J. Watson (eds). Interdisciplinary Dimensions of Accounting for Social Goals and Social Organizations. Columbus, OH: Grid.
Deaux, K. & Emswiller, T. (1974). Explanations of successful performance on sex-linked tasks: What's skill for the male is luck for the female. Journal of Personality and Social Psychology, 29, 80-85.
Ilgen, D. R. & Favero, J. L. (1985). Limits in generalization from psychological research to performance appraisal processes. Academy of Management Review, 10, 311-321.
Knight, P. A. (1984). Heroism versus competence: Competing explanations for the effects of experimenting and consistent management. Organizational Behavior and Human Performance, 33, 307-322.
Landy, F. J. & Farr, J. L. (1980). Performance rating. Psychological Bulletin, 87, 72-107.
Maxwell, S. E., Camp, C. J. & Arvey, R. D. (1981). Measures of association: A comparative examination. Journal of Applied Psychology, 66, 525-534.
Schein, V. E. (1973). The relationship between sex role stereotypes and requisite management characteristics. Journal of Applied Psychology, 57, 44-48.
Schein, V. E. (1975). Relationship between sex role stereotypes and requisite management characteristics among female managers. Journal of Applied Psychology, 60, 340-344.
Staw, B. M. (1980). Rationality and justification in organizational life. In B. M. Staw & L. L. Cummings (eds). Research in Organizational Behavior, vol. 2, pp. 45-80. Greenwich, CT: JAI Press.
Staw, B. M. (1981). The escalation of commitment to a chosen course of action. Academy of Management Review, 6, 577-587.
Staw, B. M. & Ross, J. (1980). Commitment in an experimenting society: A study of the attribution of leadership from administrative scenarios. Journal of Applied Psychology, 65, 249-260.
Received 12 November 1985; revised version received 27 February 1986