The effectiveness of risk management: measuring what didn't happen

John F. McGrew
Pacific Bell, San Ramon, California, USA
John G. Bilotta
Charles Schwab & Co., Inc., San Francisco, California, USA

Keywords: Risk management, Measurement, Project management

Management Decision 38/4 [2000] 293-300
© MCB University Press [ISSN 0025-1747]
The current issue and full text archive of this journal is available at http://www.emerald-library.com

Abstract
Two of the most common reasons for not implementing a risk management program are cost and benefit. This paper focuses on whether the benefits of intervention can be shown to justify the costs. A confounding factor is that the acts of intervention during a risk management program may alter the outcome in ways we cannot separate and therefore cannot cost out. A second confounding factor is response bias - the tendency of individuals consistently to underestimate or overestimate risk, resulting in interventions that may be ineffective or excessively wasteful. The authors demonstrate that signal detection theory (SDT) can be used to analyze data collected during a risk management program to disambiguate the confounding effects of intervention and response bias. SDT can produce an unbiased estimate of percent correct for a risk management program. Furthermore, this unbiased estimator allows comparison of results from one program to another.

Introduction

Development of software is an inherently risky business. The probability of schedule slippage on large software engineering projects is nearly 100 per cent (Jones, 1994). The probability of cost overruns exceeds 50 per cent, while the probability of outright failure is about 10 per cent (Jones, 1994). One in ten large system efforts in the USA is never finished or delivered (Putnam and Myers, 1992). Any effort on the scale of software development involving significant risk factors requires that risks be actively managed if the effort is to succeed.

Project risk management is the overall process of analyzing and addressing risk. The process has three major components: assessment, mitigation, and contingency planning (Charette, 1989). In the assessment phase, risks are analyzed for likelihood and impact on project objectives. The mitigation phase generates action plans to minimize the risks. As the project progresses, the effectiveness of the mitigation plans is reviewed and adjustments are made as necessary. Finally, contingency plans are developed to offset the consequences of failure should the risk mitigation plans fail (Bilotta, 1995). An effective program of risk management is an ongoing process of assessment, intervention and fallback planning (Boehm, 1989). Yet the implementation of risk management programs on software development projects is rare in commercial business (Jones, 1994).

The reasons for not implementing risk management include cost, benefits, and expertise. The most common rationalizations are that the project is too small (or too large) to justify the time and expense of a review; that the benefits cannot be determined and, therefore, the costs are assumed to outweigh the benefits; and that the effort is unlikely to uncover anything that is not already well known to everyone involved in the project. Nearly every project risk is known to at least one person on the team. Success results from assuring that someone owns the responsibility for addressing each risk. The issue is to identify, document, and manage risks - not just to know about them. This paper is focused on the first two concerns, which come down to whether the benefits of risk management justify the costs - a question that can only be answered if we can measure the effectiveness of a risk management program.

It is difficult to argue against the objection that the benefits of risk management cannot be demonstrated. If a team assesses the risks associated with a project and intervenes to minimize them and the project succeeds, was the intervention program successful - or would the project have succeeded anyway? The act of intervention may have altered the outcome in ways we cannot separate out - or it may not have altered it at all. The act of intervention confounds any effort to quantify the effectiveness of a risk management program.

A second confounding factor is response bias. We can best understand the impact of response bias by examining two extreme hypothetical cases: Projects A and B. The managers of Projects A and B are both anxious to minimize their risks. They each hold risk assessment and planning sessions to identify risks and to create risk mitigation and contingency plans.

The team members of Project A are very confident about their ability to deliver on all tasks; they see little or no risk associated with any of their commitments. Rather than assessing each activity objectively by reference to its actual risks, the team is nay-saying, that is, minimizing the assessment of risk across the board. Consequently, their analysis generates few, if any, mitigation or contingency plans. The result is that, as
problems do arise, the impacts are greater than expected and the team is unprepared to deal with them. In contrast, the team members of Project B see significant problems hidden in every task, even the simplest routine work. They err on the side of yea-saying. In their minds, every task is overflowing with risk that must be controlled. As a result, their analysis generates dozens or hundreds of action items for staff members to implement at considerable cost in terms of people and resources with, proportionately, little in the way of payback.

Assuming that both projects have moderate, and roughly equal, amounts of risk, it is not too difficult to imagine their respective outcomes. Project A demonstrated a strong negative response bias by consistently underestimating the risk associated with the project and, consequently, underpreparing for the consequences. Project B, on the other hand, demonstrated a strong positive response bias by overestimating the risks and, therefore, overpreparing for the consequences.

What will the outcome be? Project A, which underestimated the risks, will experience significant problems in delivering its product and will dismiss risk assessment as ineffective because it did not prevent the problems from occurring. Project B, which overestimated the risk, will likely avoid most of the real problems in delivering its product and all of the imaginary ones as well. It is likely that it too will dismiss the value of risk management since it was so costly in terms of people and resources yet the project was successful anyway.

Ideally, we want to optimize the value of a risk management program. To achieve that goal, we must ask whether it is possible, based on an evaluation of a team's risk management efforts relative to the actual outcome of the project, to determine just how skillful the team was in correctly identifying significant risks, dismissing insignificant risks, and intervening to head off risks. In short, can we, from the data, determine a team's skill in risk assessment and risk mitigation while controlling for response bias? Can we provide an unbiased estimate of the effectiveness of risk management? Put another way, can we measure what we may have caused to not happen?

This paper discusses techniques for measuring and understanding the effectiveness of risk management programs while controlling for response bias. Our research data focus on risk programs that were implemented for two software development teams over several product releases. The measure of effectiveness is based on an analysis that contrasts the results of the risk assessment, mitigation, and contingency planning efforts with the actual outcome of the project in terms of the tasks that failed and the tasks that did not. There are two prerequisites for such an analysis. The first is that the risk assessment, mitigation, and contingency data be tracked. The second is that an independent audit or post-implementation review of the project be conducted in a way that allows the actual outcome of each task (in terms of its degree of failure) to be assessed. This information can then be organized into two contingency tables (examples with hypothetical data for an imaginary project are provided by Tables I and II).

Table I
Contingency table for the outcome of a hypothetical risk assessment program

                                  Tasks that   Tasks that     Row
                                  failed       did not fail   totals
Tasks estimated to be high risk   21            4             25
Tasks estimated to be low risk     9           32             41
Column totals                     30           36             66

Table II
Contingency table for the outcome of a hypothetical risk intervention program

                                  Tasks that   Tasks that     Row
                                  failed       did not fail   totals
Tasks intervened                  19           12             31
Tasks not intervened              11           24             35
Column totals                     30           36             66

The rows of Table I separate the tasks into two groups: those which the team estimated to be of high risk and those which it estimated to be of low risk. The rows of Table II also separate the tasks into two groups: those which the team judged to require intervention and those which required no intervention. Similarly, the columns in both Tables I and II separate the tasks into two groups: those found, after completion of the project, to have failed and those found not to have failed.

If we look at the second column of each table (Tables I and II), we can see the nature of the analytical problem facing us. At the end of the project, 36 of the tasks were classified as having been completed successfully, that is, they did not fail. Of
those, 32 were judged to be of low risk (cf. the lower righthand cell of Table I) so no intervention was planned for 24 of them (cf. the same cell of Table II). On the other hand, the team did intervene in the four tasks judged to be of high risk, taking steps, either through risk mitigation or contingency planning, to minimize the risk. The core analytical problem is that the team intervened on behalf of four tasks based on its estimate of the risk inherent in each task. Consequently, it is not possible to say how many of those four tasks succeeded because the team intervened as opposed to the number that would have succeeded even if the team had not intervened. After all, it also intervened on eight other low risk tasks that were successful (the team intervened in a total of 12 tasks as shown in Table II, but only four of the 12 tasks were assessed to be of high risk in Table I). This is an instance of a positive response bias, that is, seeing problems where they don't exist.

In the same way, if we look at the first column of each table, we see that the team estimated 21 of the 30 tasks that failed to be of high risk but intervened on only 19 - yet all 30 tasks failed. Is this a skill problem on the part of the team in intervening on the correct tasks, or is it an example of a negative response bias? If an intervened task fails, did the intervention prevent a worse failure or did it actually cause the failure? If the task succeeds, did the intervention cause the success or would it have succeeded anyway? In other words, we need a method to disambiguate the interaction of risk and intervention. We need a tool to extract from the risk assessment and intervention contingency tables any bias toward yea-saying or nay-saying if we are accurately to estimate the effectiveness of the overall risk management program.

At the end of any risk management program, a project team will be unable to say much about its effectiveness. It will know how many tasks failed and how many did not - it will have some estimate of its overall success rate - but it will not know whether its intervention had an impact on the tasks that succeeded or whether their success or failure was owing to an overly positive or negative response bias. This paper will show that, using the tools of signal detection analysis, the skill of the team, that is, the effectiveness of the team's intervention program in mitigating and controlling risk, can be estimated while controlling for response bias. Understanding response bias provides the means to assess, and through repeated use, to optimize the value of a risk management program.

Applying signal detection theory to the problem

Signal detection theory (SDT) was developed in the communications industry by Abraham Wald (Swets, 1996) to provide a tool for determining whether a signal could be accurately separated from background noise. The technique was adapted by Swets and Tanner (Swets, 1996) for use in psychophysics to separate a subject's skill from the response bias when detecting a signal. Since that time, SDT has been used extensively to describe the interaction of a person's sensitivity and bias in perceptual, auditory, and other sensory tasks (Macmillan and Creelman, 1991; Swets, 1996). It has also been extended beyond sensory discrimination tasks to describe the interaction between a person's understanding of the world and the world as it actually exists (McGrew and Jones, 1976; McGrew, 1983, 1994). Used in this way, it provides a method to separate an individual's ability to understand the realities of a situation from his/her biases about the situation. In the context of risk management, it can separate a team's ability to accurately detect and intervene in risks from its tendency to nay-say or yea-say.

The advantage of SDT over traditional inferential and descriptive statistics is that, instead of comparing a sample distribution to a theoretical distribution, it compares two sample distributions directly. The ability to work directly with sample distributions makes SDT an ideal tool for assessing the success or failure of a risk management program because there is no theoretical distribution for risk. The comparison of sample distributions is configured as a contingency table. The contingency table contrasts the perception of the world with the world as it really is. Table III shows the contingency table layout for a typical SDT analysis. In Table III, A and B represent the column marginals, which sum to 1; C and D represent the row marginals, which do not necessarily sum to 1.

Table III
Contingency table for a typical signal detection analysis

                     Event        Event did      Row
                     occurred     not occur      totals
Event reported       Hit          False alarm    C
Event not reported   Incorrect    Correct        D
                     rejection    rejection
Column totals        A = 1        B = 1

The row labels in Table III (event reported and event not reported) represent a perception of the world. The column labels
(event occurred and event did not occur) represent the real world. In the case of a risk management program, using Table II as an example, we can interpret the row "task intervened" as event reported and "task not intervened" as event not reported. We can, also in Table II, interpret the column "task failed" as event occurred and "task did not fail" as event did not occur. The row and column headings of Table I can be interpreted in the same way.

The upper lefthand cell of Table III (labeled "hit") represents the correct perceptions of true events (event reported and event occurred). The lower righthand cell of Table III (labeled "correct rejection") represents the correct perceptions of false events (event not reported and event did not occur). Together these two cells represent the only situations in which the team has correctly discerned the nature of the events it has examined or in which it has intervened. That is, it has correctly called true events true and false events false. In the context of risk management, we can say the team has correctly identified high risks that require intervention and low risks that do not.

In contrast, the other diagonal of Table III represents those items the team has incorrectly perceived. That is, these are events which are false but which the team has judged to be true (false alarms - upper righthand cell) or events which are true and which the team has judged to be false (incorrect rejections - lower lefthand cell).

Owing to the confounding effect of intervention, however, the only two cells of Table III that offer the possibility of an unambiguous interpretation fall along the bottom row (the incorrect rejections and the correct rejections). In the top row, the truth of a task (that is, whether it failed) is partly due to the actions of the risk management team. If you intervene to mitigate a risk and it does occur (that is, the task fails), you don't know whether it would have failed anyway or if the intervention increased or decreased the degree of failure. The same problem exists for an intervention for which the risk does not occur (that is, the task does not fail). You don't know whether the intervention succeeded or if the risk simply did not materialize. Interpretation of the top row is confounded by the act of intervention and the human propensity toward response bias. However, in the bottom row, it is clear when a risk occurs and you do not intervene (incorrect rejection), and it is also clear when you do not intervene and a risk does not occur (correct rejection).

Traditionally, SDT uses the hit and false alarm rates (the top row of Tables I, II and III) for its analysis. One can just as appropriately use the correct and incorrect rejections in the bottom row to perform the analysis owing to the symmetry of the theory. The structure of an SDT contingency table is such that we can infer the correct values for the top row of the table from the bottom row by virtue of the fact that the columns must sum to 1. In this analysis, we use the incorrect and correct rejection rates to determine the skill and bias of the team because they provide a direct and unambiguous measure of the effectiveness of the team's risk management program. The outcome would be the same if we used the inferred values. However, the inferred values are confounded with the actions of the team in an unknowable way.

SDT provides a model for understanding the contingency table summarizing the outcome of a risk management program in spite of the analytical complications introduced by the team's acts of intervention. An SDT analysis of the contingency tables for risk assessment and risk intervention allows us to determine three critical metrics about the effectiveness of a risk management program: it provides an estimate of the team's skill, of its response bias, and an unbiased estimate of the team's percent correct, an easily understood indicator of effectiveness. We could, alternatively, take the very simple approach of adding up the hits and correct rejections (all of the team's correct responses) and dividing by the total number of tasks to determine a simple percent-correct metric from the contingency table. The straight percent-correct metric, however, is easily distorted by response bias, which can cause a gross over- or underestimate of the true percent correct. In our empirical data, we will show just such a large effect for one of the project teams.

Method

The data presented here were collected from two different project teams across three product releases. The first team, Project T, implemented a risk management program during the final months of the project's overall schedule. The team consisted of almost 200 management and technical professionals working over a period of five years (Bilotta, 1995). The project management team requested a risk program be set up to address concerns about the readiness of the project. This was the team's first attempt at formal risk management.

Fifty-two tasks remaining in the project schedule were reviewed once a month for six months by the key technical staff under the
direction of one of the authors. Likeliness of failure and impact of failure were estimated by the team for each task and were used to assess risk according to a process that was first documented by the Air Force (US Air Force, 1988). Separate sessions were held to develop action plans to mitigate identified risks or to develop contingency plans to minimize the impact of failure.

After the application had been successfully implemented, a quality assurance team from outside the project performed a post-conversion review, covering all aspects of the development effort. They provided an independent assessment of the root causes of success and failure that could be linked back to project activities, including the 52 tasks monitored under the risk management program (Bilotta, 1995). This, plus the original risk assessment, mitigation, and contingency planning data, provided the material to construct risk assessment and risk intervention contingency tables for the team.

The same methodology was used to establish a risk management program for a second project, Project N, being developed in multiple phases by a small team of a dozen technical staff. Risk assessments, facilitated by the authors, were done once for each of two separate project phases. Again, follow-up assessments of success and failure for individual tasks were conducted by a quality assurance manager.

To begin the SDT analysis for risk assessment, the team's assessment of risk for each task (high or low) was matched with the task's corresponding outcome (failed or did not fail) - see Tables IV, VI and VIII. The same procedure was followed to analyze risk intervention: the team's action relative to each task (intervened or did not intervene) was matched with the task's outcome (again, failed or did not fail) - see Tables V, VII and IX. In this way, each task was placed in one of four categories corresponding to the four cells of the contingency table: correct rejections, incorrect rejections, hits, and false alarms.

Table IV
Risk assessment contingency table for Project T

                                  Tasks that   Tasks that     Row
                                  failed       did not fail   totals
Tasks estimated to be high risk   18            3             21
Tasks estimated to be low risk     7           24             31
Column totals                     25           27             52

Table V
Risk intervention contingency table for Project T

                        Tasks that   Tasks that     Row
                        failed       did not fail   totals
Tasks intervened        16           15             31
Tasks not intervened     9           12             21
Column totals           25           27             52

Table VI
Risk assessment contingency table for Project N, Phase I

                                  Tasks that   Tasks that     Row
                                  failed       did not fail   totals
Tasks estimated to be high risk    3            5              8
Tasks estimated to be low risk     1            6              7
Column totals                      4           11             15

Table VII
Risk intervention contingency table for Project N, Phase I

                        Tasks that   Tasks that     Row
                        failed       did not fail   totals
Tasks intervened         3            7             10
Tasks not intervened     1            4              5
Column totals            4           11             15

Table VIII
Risk assessment contingency table for Project N, Phase II

                                  Tasks that   Tasks that     Row
                                  failed       did not fail   totals
Tasks estimated to be high risk    1            0              1
Tasks estimated to be low risk     0           27             27
Column totals                      1           27             28

Table IX
Risk intervention contingency table for Project N, Phase II

                        Tasks that   Tasks that     Row
                        failed       did not fail   totals
Tasks intervened         0           11             11
Tasks not intervened     1           16             17
Column totals            1           27             28
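The classification step described in the Method section above can be sketched in a few lines of Python. This is an illustrative sketch only: the task records and their field layout are hypothetical, not the authors' actual data format; a real analysis would read the team's tracked assessments and the reviewers' outcome judgments.

```python
from collections import Counter

def sdt_cell(reported, occurred):
    """Map one task to its SDT cell (cf. Table III).

    reported -- the team's call: estimated high risk, or intervened
    occurred -- the audited outcome: the task failed
    """
    if reported and occurred:
        return "hit"
    if reported and not occurred:
        return "false alarm"
    if not reported and occurred:
        return "incorrect rejection"
    return "correct rejection"

# Hypothetical records of (estimated high risk?, failed?) pairs.
tasks = [(True, True), (True, False), (False, True),
         (False, False), (False, False)]

# Tally the four cells of a contingency table like Tables IV-IX.
table = Counter(sdt_cell(r, o) for r, o in tasks)
```

Summing the "hit" and "incorrect rejection" cells recovers the "tasks that failed" column total, and likewise for the other column, so the tallies can be cross-checked against the tracked task counts.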
A correct rejection is a task which did not fail and which, during assessment, was judged to be of low risk or which, during intervention, was not acted upon. An incorrect rejection is a task which did fail during implementation but which, during assessment, was judged to be of low risk or which, during intervention, was not acted upon. The upper lefthand cell of the upper row contains the risk estimates and interventions that were successful (hits) while the upper righthand cell contains the risk estimates and interventions that were not (false alarms). The reciprocal relationship between the cells in the upper and lower rows is made clear in Equations (1) and (2), where H stands for hits, FA for false alarms, IR for incorrect rejections, and CR for correct rejections. A is the column total for "event occurred" (task failed) and B is the column total for "event did not occur" (task did not fail). The relationships in Equations (1) and (2) are what enable us to estimate the effect of events that did not happen.

1 = (H/A) + (IR/A)    (1)
1 = (FA/B) + (CR/B)   (2)

The primary measures derived from a signal detection analysis are d', which is a measure of the observer's sensitivity or skill, and c, which is a measure of the observer's response bias. The statistic d' is the distance, in standard deviation units, between the mean of the distribution of tasks that failed and the mean of the distribution of tasks that did not. d' can be interpreted in the same way as differences between standard normal scores (or z-scores). A d' of 0 indicates a complete overlap of the sample distributions. A d' of 1.65 indicates a distance between the means of the sample distributions of 1.65 standard normal scores, or about a 5 per cent overlap of the distributions, while a d' of 3 indicates an overlap of no more than 0.3 per cent.

The statistic c is the acceptance cutoff point for choosing between the two sample distributions. Its value, positive or negative, reflects the shift in the team's criterion for accepting false alarms and incorrect rejections, the two types of errors that can be made. As c shifts, the team is adjusting its level of acceptance of incorrect rejections or false alarms. When c = 0, the team has been able to minimize the acceptance of both. c translates to observer bias, or the tendency to yea-say or nay-say. Both risk assessment and risk intervention decisions are considered unbiased when c = 0.

Because c indexes the tendency to say no, when c > 0 the bias is towards nay-saying or denying that the risk exists. When c < 0 the bias is towards yea-saying or seeing risk where it doesn't exist. The range of c runs from 0 to any value, positive or negative, although it is unusual for it to be beyond ±2. It is obvious that the closer c is to 0, the more efficient the risk management program, because response bias is not leading the team toward underestimating or overestimating the risk. When c = 0, risk intervention actions are taken when required, but only when required. Both d' and c are based on a comparison of the hit and false alarm rates or, alternatively, the correct rejections and the incorrect rejections, as in our case.

Other measures can also be derived from an SDT analysis, including p(c), the raw (but biased) percent correct described earlier, and p(c)unb, an unbiased estimate of the percent correct. The statistic p(c) is the sum of the hits and the correct rejections divided by the total number of tasks in all the cells. p(c) yields a biased estimate of the true percent correct because the underlying sample distributions may be skewed. The statistic p(c)unb yields an unbiased estimate of the true percent correct and is calculated from d', which is a measure of skill that is independent of bias.

Equations (3), (4), (5), and (6) are used to calculate d', c, p(c), and p(c)unb respectively.

d' = z(H/A) - z(FA/B)            (3)
c = -0.5[z(H/A) + z(FA/B)]       (4)
p(c) = (H + CR)/(A + B)          (5)
p(c)unb = Φ(d'/2)                (6)

In Equations (3) and (4), z is the standard normal variate expressed in standard deviations. In Equation (6), Φ is the standard normal cumulative probability.
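As a sketch of how Equations (3)-(6) work in practice (an illustration, not part of the original study), the four measures can be computed from a contingency table with the Python standard library, where NormalDist supplies both z (the inverse normal) and Φ (the normal cumulative probability). Note that a hit or false alarm rate of exactly 0 or 1 makes z undefined, so a table such as Table IX would first need one of the standard SDT corrections to its cell counts.

```python
from statistics import NormalDist

def sdt_measures(h, fa, a, b):
    """Equations (3)-(6): skill d', bias c, raw and unbiased percent correct.

    h = hits, fa = false alarms, a = total tasks that failed,
    b = total tasks that did not fail. By Equations (1) and (2),
    IR = a - h and CR = b - fa, so either row yields the same rates.
    """
    nd = NormalDist()
    z_h = nd.inv_cdf(h / a)         # z(H/A)
    z_fa = nd.inv_cdf(fa / b)       # z(FA/B)
    d_prime = z_h - z_fa            # Equation (3)
    c = -0.5 * (z_h + z_fa)         # Equation (4)
    p_c = (h + (b - fa)) / (a + b)  # Equation (5), with CR = b - fa
    p_c_unb = nd.cdf(d_prime / 2)   # Equation (6)
    return d_prime, c, p_c, p_c_unb

# Project T risk assessment (Table IV): H = 18, FA = 3, A = 25, B = 27.
d, c, pc, pcu = sdt_measures(18, 3, 25, 27)
# d ~ 1.80, c ~ 0.32, pc ~ 0.81, pcu ~ 0.82 -- in line with the Project T
# risk assessment row of Table X (small differences reflect rounding).
```

Because the two rates in each column sum to 1, calling the function with the incorrect and correct rejection counts instead (IR = 7, CR = 24 for Table IV) reproduces the same d' and c up to sign conventions, which is the symmetry the text relies on.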
John F. McGrew and project team's intervention actions for the 52 during risk assessment but underestimated
John G. Bilotta tasks with the post-implementation review the risk during intervention. In contrast,
The effectiveness of risk team's determination of the degree of failure
management: measuring what Team T tended to minimize the risk during
didn't happen or success for the same tasks. Table V makes assessment but to overreact to the risk
Management Decision clear that the team was able to control the during intervention. Bias results of this type
38/4 [2000] 293±300 risk associated with only 15 of the 31 tasks for are an indicator of systemic problems within
which it intervened ± the other 16 tasks in an organization and a zeitgeist in which
which the team intervened failed. On the teams publicly deny problems but scramble
other hand, the risk management team was behind the scenes to correct them. This is
successful in predicting that 12 of the 27 tasks frequently a consequence of a ``no negative
which did not fail did not require feedback'' culture.
intervention. Although p(c), the raw percent correct
Tables VI and VII summarize the shown in the third column of Table X, has the
assessment and intervention results of the
advantage of being more familiar to most
risk management program established by
people than d', it is only accurate in
Project N for Phase I of its project. Fifteen
situations where there is no bias in the
critical tasks were reviewed by the team and the outcome, at the end of Phase I, was assessed by a quality assurance manager. Tables VIII and IX summarize the assessment and intervention results of Project N's risk management program for Phase II. Twenty-eight critical tasks were reviewed by the team and again the outcome was assessed by a quality assurance manager.

The values of d' and c were calculated for each contingency table using Equations 3 and 4. The raw and unbiased estimates of the percent correct were calculated using Equations 5 and 6. The results for Project T and Project N, Phases I and II, are summarized in Table X.

Table X
Skill and bias estimates for Project T and Project N, Phases I and II

                         d'          c           p(c)                p(c)unb
                         ``Skill''   ``Bias''    ``raw % correct''   ``Unbiased % correct''
Project T
  Risk assessment        1.81         0.32        81                  82
  Risk intervention      0.37        -0.34        54                  57
Project N, Phase I
  Risk assessment        0.80        -0.20        60                  66
  Risk intervention      0.30        -0.45        47                  62
Project N, Phase II
  Risk assessment       12.00         0.00       100                 100
  Risk intervention      5.70         3.15        57                  99

The d' values in Table X show that both teams were effective at assessing risk. Team T had a d' of 1.81 in risk assessment; Team N had a d' of 0.80 in Phase I and a d' of 12.00 in Phase II. Both teams were less effective at risk intervention. Team T had a d' of 0.37; Team N had an intervention d' of 0.30 in Phase I and of 5.70 in Phase II. The assessment and intervention d' values for Team N show a strong learning effect from Phase I to Phase II.

The c values in Table X indicate that Team N overestimated the risk during both assessment and intervention for Phase I. In Phase II, it provided an unbiased estimate for risk assessment. The statistic c reflects any systematic bias in the observer's judgments. So long as c is near zero, the difference in interpretation between p(c)unb and p(c) is small, but if there is bias, whether positive or negative, the difference between p(c)unb and p(c) will grow. For instance, a comparison of the raw and unbiased percent correct shows that if -0.30 < c < +0.30, then the raw percent correct is fairly accurate. If the bias exceeds ±0.30, the raw percent correct underestimates the actual percent correct. Team N, Phase I, had a p(c) of 47 per cent for its risk intervention efforts but a p(c)unb of 62 per cent, a 15 per cent underestimation. Team N, Phase II, had a p(c) of 57 per cent for risk intervention but a p(c)unb of 99 per cent, a 42 per cent underestimation. This is not surprising given that the bias is 0.45 in the first case and 3.15 in the second. In contrast, all three teams show little difference between p(c) and p(c)unb for risk assessment, where their bias scores all fall within approximately ±0.30. Thus, it can be seen that p(c) can give seriously erroneous estimates of a risk management program's effectiveness.

Team T made only one pass at risk management so we can say nothing about its ability to learn from experience. However, Team N shows clearly that this technique is most valuable in its repetition. By using risk management and SDT analysis from one phase to the next, the team is able to build on its past experiences to improve its future performance, and the metrics d', c, and p(c)unb provide the means to quantify and demonstrate for management the level of improvement and value of a risk management program.

Discussion

Is it possible to measure what you may have prevented from happening? The answer is yes. Ultimately, the effectiveness of a team's risk management program is probably best
represented by the effectiveness of its intervention strategy. In our opening discussion, a team following a strategy of intervention to minimize risk has no way of judging the effectiveness of its efforts. It is unable to do so because it is not possible to separate the set of all successful tasks into those that owe their success to intervention and those that do not. Signal detection theory, however, enables us to separate them out by looking at the correct and incorrect rejections. The prerequisites to doing this are that potential risks be identified and tracked, and that an independent ``after-action'' review of the results be done.

When arranged in contingency tables and subjected to SDT analysis, the data can be used to extract estimates of a team's skill and bias in assessing and intervening in risks. SDT also provides a means to calculate an unbiased estimate of the percent correct, a direct indicator of effectiveness. The unbiased percent correct p(c)unb is derived from the value of d' and has the advantage of being more easily and intuitively understood by those less familiar with SDT, especially if percent correct is a measure being used to report the effectiveness of risk management programs. In the context of risk management, either d' or p(c)unb is a valuable and unbiased estimator of effectiveness.

The positive news is that even teams inexperienced in risk management are able to assign risk estimates with a remarkable degree of accuracy. The risk assessment team for Project T had never participated in a risk management program, yet it was able to identify correctly the risk associated with 82 per cent of the tasks. On the other hand, controlling risk through intervention is more difficult to do. Although Project T could identify the correct level of risk 82 per cent of the time, it was able to mitigate only 57 per cent of the identified risk. Team N increased its ability to assess risk between Phase I, where p(c)unb = 66 per cent, and Phase II, where p(c)unb = 100 per cent. Its risk intervention skill improved from 62 per cent in Phase I to 99 per cent in Phase II.

We have no way of knowing if the effectiveness ratings for risk assessment and mitigation for the projects in this study are above or below average relative to other software development projects. For the project teams in this study, we have some results indicating that teams inexperienced in risk management can be very effective in managing risk, if only by their own reports to the authors of their satisfaction with the results of the program. An accumulation of data from a number of projects will help establish the true range of risk control. Because p(c)unb is monotonically related to d', either d' or p(c)unb can be used to support the direct comparison of results across widely different projects.

References

Bilotta, J.G. (1995), ``A study in risk management'', presented at the Bay Area Software Process Improvement Network, 18 January.
Boehm, B. (1989), Software Risk Management, IEEE Computer Society Press, Los Alamitos, CA.
Charette, R.N. (1989), Software Engineering Risk Analysis and Management, McGraw-Hill, New York, NY.
Jones, C. (1994), Assessment & Control of Software Risks, Prentice-Hall, Englewood Cliffs, NJ.
Macmillan, N.A. and Creelman, C.D. (1991), Detection Theory: A User's Guide, Cambridge University Press, New York, NY.
McGrew, J.F. (1983), ``Human performance modeling using the signal detection paradigm'', presented to the Human Factors Society, Norfolk, VA.
McGrew, J.F. (1994), ``Measuring the success of the SEI model using signal detection theory: an exploratory evaluation'', presented to the Software Engineering Process Group National Meeting, Dallas, TX, 27 April.
McGrew, J.F. and Jones, W. (1976), ``Likelihood ratio shift as an indicator of motivational shift in multiple choice test items'', presented to the annual meeting of the Midwestern Association of Behavior Analysis, Chicago, IL, 4 May.
Putnam, L.H. and Myers, W. (1992), Measures for Excellence, Prentice-Hall, Englewood Cliffs, NJ.
Swets, J.A. (1996), Signal Detection Theory and ROC Analysis in Psychology and Diagnostics: Collected Papers, Lawrence Erlbaum Associates, Mahwah, NJ.
US Air Force (1988), Software Risk Abatement, AFCS/AFLC Pamphlet 800-45.

Management Decision 38/4 [2000] 293-300
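For readers who want to reproduce the analysis, the SDT quantities reported in Table X can be computed directly from a 2 x 2 contingency table. The Python sketch below uses the standard signal detection definitions given in Macmillan and Creelman (1991): d' = z(H) - z(F), c = -[z(H) + z(F)]/2, and p(c)unb = Phi(d'/2). The paper's Equations 3 to 6 are not reproduced in this excerpt, so these formulas, the function name, and the example counts are illustrative assumptions rather than the authors' exact notation.

```python
from statistics import NormalDist

def sdt_metrics(hits, misses, false_alarms, correct_rejections):
    """Skill (d'), bias (c), raw and unbiased percent correct from a
    2x2 risk-management contingency table.

    In the paper's usage, a "hit" is a task correctly judged at risk,
    a "false alarm" is a safe task judged at risk, and so on.
    Definitions assumed from Macmillan and Creelman (1991).
    """
    z = NormalDist().inv_cdf       # inverse normal CDF (z-transform)
    phi = NormalDist().cdf         # normal CDF

    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    # Rates of exactly 0 or 1 make z() undefined; in practice a
    # small-sample correction is applied first (not shown here).

    d_prime = z(hit_rate) - z(fa_rate)            # skill
    c = -(z(hit_rate) + z(fa_rate)) / 2           # bias
    total = hits + misses + false_alarms + correct_rejections
    p_c = (hits + correct_rejections) / total     # raw percent correct
    p_c_unb = phi(d_prime / 2)                    # unbiased percent correct
    return d_prime, c, p_c, p_c_unb
```

With a symmetric table (for example, 8 hits, 2 misses, 2 false alarms, 8 correct rejections) the observer is unbiased (c = 0) and the raw and unbiased percent correct agree at 80 per cent; shifting the criterion while holding d' fixed drives p(c) below p(c)unb, which is the underestimation effect described in the text.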
Application questions
1 What is your organisation's risk management plan? Are there any areas which you think might need addressing based on the authors' arguments/discussions?
2 Which of your projects do you feel are the most high risk? Do you think this idea could help you?