
SEPTEMBER 10, 2007
POWER AND LEVEL VALIDATION OF MOODY'S KMV EDF CREDIT MEASURES IN NORTH AMERICA, EUROPE, AND ASIA
MODELING METHODOLOGY
AUTHORS
Irina Korablev
Douglas Dwyer
ABSTRACT
In this paper, we validate the performance of Moody's KMV EDF credit measures in their
timeliness of default prediction, ability to discriminate good firms from bad firms, and
accuracy of levels in three regions: North America, Europe, and Asia. We focus on the period
1996-2006 for most of our tests. Wherever possible, we compare the performance to that of
other popular alternatives, such as agency ratings, Moody's KMV RiskCalc EDF credit
measures, Altman's Z-Scores, and a simpler version of the Merton model. We find that EDF
credit measures perform consistently well across different time horizons and across
subsamples based on firm size and credit quality. Our tests indicate that EDF credit measures
provide a very useful measure of credit risk that can be applied throughout the world.
Copyright 2007, Moody's KMV Company. All rights reserved. Credit Monitor, CreditEdge, CreditEdge Plus,
CreditMark, DealAnalyzer, EDFCalc, Private Firm Model, Portfolio Preprocessor, GCorr, the Moody's KMV logo,
Moody's KMV Financial Analyst, Moody's KMV LossCalc, Moody's KMV Portfolio Manager, Moody's KMV Risk
Advisor, Moody's KMV RiskCalc, RiskAnalyst, Expected Default Frequency, and EDF are trademarks owned by MIS
Quality Management Corp. and used under license by Moody's KMV Company.
Published by:
Moody's KMV Company
TABLE OF CONTENTS
1 INTRODUCTION
2 CREDIT RISK ASSESSMENT APPROACHES
2.1 Moody's KMV EDF Credit Measures
2.2 Agency Ratings
2.3 Moody's KMV RiskCalc EDF Credit Measures
2.4 Merton's Structural Model
2.5 Altman's Z-Score
3 EMPIRICAL METHODOLOGY
3.1 Timely Default Prediction
3.2 Default Predictive Power
3.3 Level Validation with Default Data
3.3.1 Interpreting the Analytical Outputs for Level Validation
3.4 Level Validation with CDS Data
3.5 Median EDF by Rating Category across Regions
4 EMPIRICAL RESULTS
4.1 North America
4.1.1 Data
4.1.2 Timely Default Prediction - U.S.
4.1.3 Default Predictive Power - U.S.
4.1.4 Accuracy of Levels - U.S.
4.1.5 Timely Default Prediction - Outside the U.S.
4.1.6 Default Predictive Power - Outside the U.S.
4.1.7 Accuracy of Levels - Outside the U.S.
4.1.8 Conclusion
4.2 Europe
4.2.1 Diversity in Bankruptcy Mechanisms and Creditor Protection
4.2.2 Data
4.2.3 Timely Default Prediction
4.2.4 Default Predictive Power
4.2.5 Level Validation with Default Data
4.2.6 Level Validation with CDS Data
4.2.7 Conclusion
4.3 Asia
4.3.1 Data
4.3.2 Timely Default Prediction
4.3.3 Default Predictive Power
4.3.4 Level Validation
4.3.5 Conclusion
4.4 Median EDF by Rating Category across Regions
5 CONCLUSION
APPENDIX B: SUMMARY OF ACCURACY RATIOS FOR EDF CREDIT MEASURES AND AGENCY RATINGS BY YEAR
INTRODUCTION
The new Basel Capital Accord states: "The methodology for assigning credit assessments must be rigorous, systematic,
and subject to some form of validation based on historical experience." There are two important components to this
validation process: the ability to predict defaults and the accuracy of the default predictive measure.
The first criterion implies that a credit measure should be dynamic enough to be a meaningful and timely signal of
deteriorating credit quality or an impending credit event. In this regard, the Basel Accord states: "Assessments must be
subject to ongoing review and responsive to changes in financial condition. Before being recognized by supervisors, an
assessment methodology for each market segment, including rigorous back-testing, must have been established for at
least one year." This also means that the credit assessment technology should have the ability to distinguish between
defaulters and non-defaulters. It should not allow defaulters to enter the sample while trying to create a sample of good
quality firms (Type I Error). Conversely, it should not exclude good quality firms from the sample while trying to
exclude potential defaulters (Type II Error).
The second criterion is focused on the accuracy of the credit assessment measure so that it can be useful to banks and
other financial institutions in their efforts toward risk measurement, valuation, and capital allocation. The Basel Accord
states: "Banks must have a robust system in place to validate the accuracy and consistency of rating systems, processes,
and the estimation of PDs" (Probabilities of Default).
The objective of this document is to compare the performance, based on the above validation criteria, of EDF credit
measures with some of the other popular credit assessment approaches. The popular approaches that we consider are the
following:
Agency ratings
RiskCalc U.S. v3.1 private firm model
A simple Merton structural model
Altman's Z-Score
In this paper we present our test results for three regions: North America, Europe and Asia. The rest of the paper is
organized as follows: Section 2 discusses briefly the credit assessment approaches that we consider in our paper. Section 3
highlights the empirical methodology we follow to compare the approaches. Section 4 presents the results of our tests by
region and interprets the economic meaningfulness of these results.[1] Section 5 concludes the paper.
CREDIT RISK ASSESSMENT APPROACHES

The credit risk assessment approaches considered in this paper are:
Moody's KMV EDF credit measures
Agency ratings
Moody's KMV RiskCalc private firm model
Merton's structural model
Altman's Z-Scores [2]
In the following section we briefly discuss each of the approaches.
[1] Section 4.1 presents the results for North America, section 4.2 presents the results for Europe, and section 4.3 presents the results for
Asia.
[2] For reasons explained in the next two sections, not all the approaches can be subjected to tests on all the criteria. We try to include as
many of these approaches as possible in our test of each criterion.
2.1 Moody's KMV EDF Credit Measures
The structural view on credit risk was first made commercially viable with the introduction of the Vasicek-Kealhofer
(VK) model. This model offers a rich framework that treats equity as a perpetual down-and-out option on the
underlying assets of the firm. This framework incorporates five different classes of liabilities: short-term liabilities,
long-term liabilities, convertible debt, preferred shares, and common shares. To overcome the regular problems encountered
by structural models due to the assumption of normality, the VK model uses an empirical mapping based on actual
default data to get the default probabilities, known as EDF credit measures and offered by Moody's KMV.[3] Volatility is
estimated through a Bayesian approach that combines a comparables analysis with an iterative approach.
EDF credit measures are the outputs of Moody's KMV Credit Monitor and CreditEdge applications. An EDF credit
measure is a quantitative measure of credit quality. More specifically, an EDF credit measure is an estimate of the
physical probability of default for a given firm. For an overview of the EDF credit measure, see Crosbie and Bohn
(2003).
In 2007, Moody's KMV released EDF 8.0, which refines the mapping of the Distance-to-Default to the EDF credit
measure using a much larger default database observed over a longer time period. Details of the new model enhancement
can be found in Dwyer and Qu (2007).
The EDF estimates are now bounded between 0.01% (for an EDF value of 0.01) and 35% (for an EDF value of 35).
Moody's KMV offers a term structure of EDF credit measures for 1 to 10 years and an extrapolation scheme to get
shorter-term EDF credit measures. The risk-free rate used in the calculation of EDF credit measures is now updated
monthly.
2.2 Agency Ratings
Moody's Investors Service, Standard and Poor's Corporation, and other well-known rating agencies around the world
have been assigning credit ratings to major borrowers for decades. These are ordinal measures of credit quality (i.e.,
they help rank firms by their quality of credit). These ratings have established international credibility because of the
long history of rating agencies, and the extensive testing of their relative performance.
2.3 Moody's KMV RiskCalc EDF Credit Measures
Moody's KMV RiskCalc is designed to calculate EDF credit measures for private companies. Private companies are
typically smaller than public companies and are not required to file financial statements with the SEC.
The RiskCalc model incorporates aspects of both the structural, market-based approach in the form of industry-level
distance-to-default measures, and the localized financial statement-based approach. While it incorporates equity market
information at the aggregate level, RiskCalc does not take advantage of the equity information of the specific company.
We used the RiskCalc v3.1 U.S. model to obtain RiskCalc EDF credit measures for the set of publicly traded companies.
Comparing public firm EDF credit measures to RiskCalc EDF credit measures computed on public firms represents an
out-of-universe test of RiskCalc.
2.4 Merton's Structural Model
The Merton model of risky debt is the original structural model of credit risk, and perhaps the most significant
contribution to the area of quantitative credit risk research. This model assumes that equity is a call option on the value
of assets of the firm. From this insight, the value of debt can be derived based on the observed equity value. The default
event is modeled as the firm's asset value falling below a threshold level (i.e., default barrier). Given the default barrier,
and the asset value parameters, the probability of default can be estimated for various horizons. A detailed description of
this model can be found in most standard finance textbooks.[4]
[3] See Eom, Helwege, and Huang (2003) for details of the discussion.
[4] See, for example, Hull (1999).
For our specific tests, the model has been implemented as:

$$\text{Default Point}_{i,\text{Merton}} = \text{Short Term Liabilities} + 0.5 \times \text{Long Term Liabilities}$$

The default probability for a firm i for a time horizon t is computed as:

$$PD_i = \Phi\left(-\,\frac{\ln\left(\dfrac{AVL_i}{\text{Default Point}_{i,\text{Merton}}}\right) + \left(\mu_i - 0.5\,\sigma_i^2\right)t}{\sigma_i\sqrt{t}}\right) \tag{1}$$

$$\sigma_i = \sigma_i^{equity}\,\frac{EVL_i}{AVL_i}\left(\frac{\partial EVL_i}{\partial AVL_i}\right)^{-1} \tag{2}$$

$$EVL_i = AVL_i\,\Phi(d_1) - \text{Default Point}_{i,\text{Merton}}\;e^{-rt}\,\Phi(d_2) \tag{3}$$

$$d_1 = \frac{\ln\left(\dfrac{AVL_i}{\text{Default Point}_{i,\text{Merton}}}\right) + \left(r + 0.5\,\sigma_i^2\right)t}{\sigma_i\sqrt{t}}, \qquad d_2 = d_1 - \sigma_i\sqrt{t}$$

$\sigma_i$, $\sigma_i^{equity}$, $AVL_i$, and $EVL_i$ are the asset volatility, equity volatility, asset value and equity value of firm i, respectively.
$\Phi(x)$ is the cumulative normal distribution function. $\mu_i$ is the drift rate for the asset returns of firm i while r is the
riskless rate of return.
$\sigma_i^{equity}$ is computed as the standard deviation of three years of weekly equity returns for company
i.[5] Asset value $AVL_i$ is computed by solving equations (2) and (3) simultaneously.
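The simultaneous solution of equations (2) and (3) can be illustrated with a short Python sketch. It is only one possible approach, not the implementation used for the tests: it assumes a plain fixed-point iteration, uses the fact that the derivative of equity with respect to asset value in equation (3) is Φ(d1), and runs on made-up inputs for a hypothetical firm. All function and variable names are illustrative.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution, Phi(x) = _N.cdf(x)


def d1_d2(asset_value, asset_vol, default_point, rate, t):
    """Return (d1, d2) as defined after equation (3)."""
    d1 = (math.log(asset_value / default_point)
          + (rate + 0.5 * asset_vol ** 2) * t) / (asset_vol * math.sqrt(t))
    return d1, d1 - asset_vol * math.sqrt(t)


def equity_value(asset_value, asset_vol, default_point, rate, t):
    """Equation (3): equity as a call option on the firm's assets."""
    d1, d2 = d1_d2(asset_value, asset_vol, default_point, rate, t)
    return (asset_value * _N.cdf(d1)
            - default_point * math.exp(-rate * t) * _N.cdf(d2))


def solve_assets(equity, equity_vol, default_point, rate, t=1.0,
                 tol=1e-10, max_iter=500):
    """Solve equations (2) and (3) for (AVL, sigma) by fixed-point iteration.

    Assumption: simple iteration is adequate for the inputs at hand;
    a production solver would need safeguards for hard cases.
    """
    asset_value = equity + default_point            # starting guess
    asset_vol = equity_vol * equity / asset_value   # starting guess
    discounted_dp = default_point * math.exp(-rate * t)
    for _ in range(max_iter):
        d1, d2 = d1_d2(asset_value, asset_vol, default_point, rate, t)
        # Equation (3) rearranged for the asset value.
        new_value = (equity + discounted_dp * _N.cdf(d2)) / _N.cdf(d1)
        # Equation (2) with dEVL/dAVL = Phi(d1).
        new_vol = equity_vol * equity / (new_value * _N.cdf(d1))
        converged = (abs(new_value - asset_value) < tol
                     and abs(new_vol - asset_vol) < tol)
        asset_value, asset_vol = new_value, new_vol
        if converged:
            return asset_value, asset_vol
    raise RuntimeError("fixed-point iteration did not converge")


def merton_pd(asset_value, asset_vol, default_point, drift, t=1.0):
    """Equation (1): probability that assets end below the default point."""
    dd = (math.log(asset_value / default_point)
          + (drift - 0.5 * asset_vol ** 2) * t) / (asset_vol * math.sqrt(t))
    return _N.cdf(-dd)


# Hypothetical firm: every number below is an illustration value.
default_point = 30.0 + 0.5 * 40.0   # short-term + 0.5 * long-term liabilities
avl, sigma = solve_assets(equity=60.0, equity_vol=0.40,
                          default_point=default_point, rate=0.05)
pd_1y = merton_pd(avl, sigma, default_point, drift=0.08)
```

Note that the drift in equation (1) is the physical asset drift, while equation (3) uses the riskless rate, so the two enter the sketch as separate arguments.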
2.5 Altman's Z-Score
Altman's Z-Score came as a response to the need for identifying the financial health of any business based on observable
accounting and market ratios. This original measure was developed in 1968 by Edward Altman, whose Z-Score is
available in various forms. We chose the public firm form, which includes market capitalization in the leverage ratio, and
calculated Z-Scores as follows:
[5] In contrast to the two equations and two unknowns, we use an iterative approach to solve for empirical volatility which is combined
with modeled volatility in a Bayesian fashion.
$$Z = X_1 + X_2 + X_3 + X_4 + X_5 \tag{4}$$

Where

$X_1 = 1.2 \times \dfrac{\text{Current Liabilities}}{\text{Book Asset Value}}$ is the ratio of Current Liabilities to Total Assets;

$X_2 = 1.4 \times \dfrac{\text{Retained Earnings}}{\text{Book Asset Value}}$ is the Profitability Ratio;

$X_3 = 3.3 \times \dfrac{\text{Operating Income before Depreciation}}{\text{Book Asset Value}}$ is the ratio of EBITDA to Total Assets;

$X_4 = 0.6 \times \dfrac{\text{Market Capitalization}}{\text{Book Value of Liabilities}}$ is the ratio of Market Value of Equity to Book Value of Liabilities; and

$X_5 = \dfrac{\text{Sales}}{\text{Book Asset Value}}$ is the ratio of Sales to Total Assets.

The calculation typically produces a Z-Score between -5 and 10, with a high Z-Score implying better credit quality
and a lower chance of bankruptcy. Z-Scores are not interpreted directly as default probabilities; they work as
ordinal measures of financial health. Therefore, they cannot be used directly for valuation, quantitative risk assessment,
and capital allocation purposes.
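As a small worked illustration of equation (4), the sketch below computes a Z-Score from the five weighted ratios exactly as they are defined in the text. The function name, argument names, and input figures are hypothetical. Note that the first ratio follows the definition printed here (current liabilities over book assets), whereas Altman's classic formulation uses working capital in that position.

```python
def altman_z_score(current_liabilities, retained_earnings,
                   operating_income_before_depreciation,
                   market_cap, book_liabilities, sales, book_assets):
    """Z-Score per equation (4), using the ratio definitions given in the text."""
    x1 = 1.2 * current_liabilities / book_assets
    x2 = 1.4 * retained_earnings / book_assets
    x3 = 3.3 * operating_income_before_depreciation / book_assets
    x4 = 0.6 * market_cap / book_liabilities
    x5 = sales / book_assets
    return x1 + x2 + x3 + x4 + x5


# Hypothetical firm, all amounts in the same currency unit.
z = altman_z_score(current_liabilities=20.0, retained_earnings=30.0,
                   operating_income_before_depreciation=15.0,
                   market_cap=80.0, book_liabilities=60.0,
                   sales=120.0, book_assets=100.0)
# Components: 0.24 + 0.42 + 0.495 + 0.8 + 1.2
```

Because the score is ordinal, only the ranking of firms by `z` is used in the power tests; the value itself is not a default probability.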
EMPIRICAL METHODOLOGY
In this section, we describe the methodology we chose for tests of each criterion.
3.1 Timely Default Prediction
Timeliness measures how many months before an impending credit event EDF credit measures signal deteriorating
credit quality. To test timeliness, we create a sample of defaulted firms, retaining monthly observations from 24 months
prior to default up to 12 months after default. We compute the median EDF credit measure and the median Moody's
rating by months to default, then overlay and compare the two.
For testing timeliness against ratings, we use the Moody's rating. To ensure that the results hold across time,
rating grades, and firm size, we also provide the analysis, wherever possible, for subsets of the data based on time
period:
1996-2000
2001 and beyond
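The event-time aggregation described above can be sketched as follows. This is a minimal illustration under assumed inputs: each observation is a tuple of firm identifier, months relative to default (negative before default), and a numeric value such as an EDF level or a rating mapped to a number. The tuple layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_by_months_to_default(observations, start=-24, end=12):
    """Median value per event month, keeping the window [start, end].

    observations: iterable of (firm_id, months_to_default, value);
    negative months are before default, positive months after.
    """
    buckets = defaultdict(list)
    for _firm, months, value in observations:
        if start <= months <= end:
            buckets[months].append(value)
    return {m: median(values) for m, values in sorted(buckets.items())}


# Made-up EDF levels (in percent) for three hypothetical defaulters.
sample = [
    ("A", -24, 1.0), ("B", -24, 3.0),
    ("A", -1, 10.0), ("B", -1, 20.0), ("C", -1, 30.0),
    ("A", -30, 0.5),  # outside the window, ignored
]
profile = median_by_months_to_default(sample)
```

Running the same function once on EDF values and once on numerically coded ratings gives the two median paths that are overlaid in the timeliness charts.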
3.2 Default Predictive Power
While a default predictive measure can be timely for warning of impending defaults, it may not be so effective in
distinguishing a good firm from a bad firm. The calibration of the model may be on the conservative side inflating the
default probability of all suspect names, of which some names might not be genuinely distressed. In this case, even
though one could claim that the model performed well in predicting impending defaults, it would be fairly mediocre in
its ability to distinguish good firms from bad firms. One of the essential features of a good model is that it should be
sophisticated enough to differentiate bad (genuinely distressed) firms from good (false alarms) firms. There are two well-known approaches to testing a model for its power:
Cumulative Accuracy Profile (CAP) with its output known as Accuracy Ratio (AR).
Receiver Operating Characteristic (ROC) with its output known as Area Under Curve (AUC).
Typically, the larger the Accuracy Ratio or Area Under Curve, the better the model. In extreme cases, a totally random
model that bears no information on impending defaults has AR = 0, and AUC = 0.5. For a perfect model,
AR = AUC = 1. The two approaches are equivalent, with AR = 2 × AUC - 1. A more detailed discussion can be found in
Appendix A.
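The relationship AR = 2 × AUC - 1 can be illustrated with a few lines of Python. The sketch computes the AUC as the probability that a randomly chosen defaulter carries a higher risk score than a randomly chosen non-defaulter (ties counted as one half) and derives the AR from it. It is a simple pairwise illustration, not the procedure used to produce the results in this paper, and the function names and sample data are hypothetical.

```python
def auc(scores, defaulted):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores where higher means riskier (e.g., EDF levels).
    defaulted: parallel sequence of booleans (True for defaulters).
    """
    bad = [s for s, d in zip(scores, defaulted) if d]
    good = [s for s, d in zip(scores, defaulted) if not d]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad) * len(good))


def accuracy_ratio(scores, defaulted):
    """Accuracy Ratio from the identity AR = 2 * AUC - 1."""
    return 2.0 * auc(scores, defaulted) - 1.0


# Made-up examples: a perfect ranking and an uninformative one.
perfect = accuracy_ratio([0.9, 0.8, 0.1, 0.2], [True, True, False, False])
uninformative = accuracy_ratio([0.5, 0.5, 0.5, 0.5], [True, True, False, False])
```

The pairwise loop is quadratic in sample size; for large samples a rank-based computation gives the same number more efficiently.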
In this article, we use the Cumulative Accuracy Profile approach, and provide AR as our output. We compared EDF
credit measures to:
Ratings
RiskCalc EDF credit measure
Simple Merton model
Altman's equity-based Z-Score.
3.3 Level Validation with Default Data
The level validation of EDF credit measures verifies how well the model's predicted default rates track realized default
rates. We employ the same methodology described in Bohn, Arora and Korablev (2005), which was first developed in
Kurbat and Korablev (2002). The procedure is summarized in the following four steps:
1. Using Monte Carlo technique, we simulate asset value movements based on a single factor Gaussian model to
capture correlated defaults.
2. We determine default/non-default state based on the level of each firm's EDF credit measure and each simulation
outcome.
3. We compare the actual default rate to the median, 10th percentile and 90th percentile of the simulated distribution.
4. We compute the probability of observing a default rate less than or equal to the realized default rate given the model
and the correlation coefficient.
We extend this methodology by using Bayesian methods to compute the posterior distribution of the aggregate shock
given the realized default rate, the model, and the correlation coefficient. The extension to the original methodology is
developed in Dwyer (2007).
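The four steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used for the results: each firm's standardized asset return is taken to be sqrt(rho) times a common shock plus sqrt(1 - rho) times an idiosyncratic shock, a firm defaults when that return falls below the normal quantile of its EDF (expressed as a probability, not a percentage), and the percentile rule is a simple index into the sorted simulations. Names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

_N = NormalDist()


def simulate_default_rates(edfs, rho, n_sims=2000, seed=0):
    """Steps 1-2: simulated portfolio default rates under one Gaussian factor."""
    rng = random.Random(seed)
    thresholds = [_N.inv_cdf(p) for p in edfs]   # default if return < threshold
    w_sys, w_idio = math.sqrt(rho), math.sqrt(1.0 - rho)
    rates = []
    for _ in range(n_sims):
        shock = rng.gauss(0.0, 1.0)              # common factor for this scenario
        defaults = sum(
            1 for thr in thresholds
            if w_sys * shock + w_idio * rng.gauss(0.0, 1.0) < thr
        )
        rates.append(defaults / len(edfs))
    return rates


def percentile(sorted_values, q):
    """Simple index-based percentile on a pre-sorted list, q in [0, 1]."""
    return sorted_values[int(q * (len(sorted_values) - 1))]


def level_validation(edfs, realized_rate, rho, n_sims=2000, seed=0):
    """Steps 3-4: summary of the simulated distribution and the P-value."""
    rates = sorted(simulate_default_rates(edfs, rho, n_sims, seed))
    return {
        "median": percentile(rates, 0.5),
        "p10": percentile(rates, 0.1),
        "p90": percentile(rates, 0.9),
        "mean": mean(rates),
        "p_value": sum(r <= realized_rate for r in rates) / len(rates),
    }


# Hypothetical portfolio: 100 firms with a 2% EDF, 3% realized default rate.
result = level_validation([0.02] * 100, realized_rate=0.03, rho=0.2, seed=1)
```

With positive correlation the simulated distribution is right-skewed, which is why the mean predicted default rate typically sits above the median.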
3.3.1 Interpreting the Analytical Outputs for Level Validation

We create two graphs as the output of the level validation test. Figure 1 is an illustrative example of the first output:
the comparison of the median predicted (by simulation) default rate and the realized default rate. The median predicted
default rate is the black line. The red line represents the actual default rate. Fifty percent of the time the actual default rate
should be above (or below) the median. We also show the mean of the predicted default rate, which is the blue line. Most of
the time the actual default rate should be below the average predicted default rate. The two gray lines correspond to the
prediction interval, which represents the range of variability that is expected in the realized default rates given the EDF
values and the assumed correlation model. This prediction interval implies that eighty percent of the time the realized
default rate should lie within the 10th and the 90th percentiles.[6]
FIGURE 1 Illustrative example of the level validation output. Comparison of median predicted default rate and
realized default rate.
[6] This prediction interval differs from the concept of a confidence interval. An x% confidence interval is a random interval for which
the probability of it holding the true value of a parameter is x%. In our context here, an x% prediction interval has the interpretation
that x% of the time the realized default rate will be within this range given the EDF levels and the correlation model.
FIGURE 2 Illustrative example of the level validation output. Posterior distribution of the aggregate shock and
P-value of the actual default rate. (Figure callouts: the P-value measures the probability of observing a default rate at or
lower than the actual default rate; the median value of the aggregate shock given the actual default rate is marked.)
The figure depicts the posterior distribution for the aggregate shock that was derived given the realized default rate, the model and the
correlation coefficient. We also computed the P-value of the actual default rate, which is the probability of observing a default rate at or
lower than the actual default rate. This P-value is shown as a blue line.
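To make the idea of a posterior for the aggregate shock concrete, the sketch below evaluates it on a grid for a simplified case. It is not the method of Dwyer (2007); it assumes a homogeneous portfolio (every firm has the same default probability), a standard normal prior on the shock, a binomial likelihood for the number of defaults given the shock, and a sign convention in which a positive shock is favorable. All names and inputs are hypothetical.

```python
import math
from statistics import NormalDist

_N = NormalDist()


def conditional_pd(pd, rho, shock):
    """Default probability given the common shock (positive shock = good news)."""
    return _N.cdf((_N.inv_cdf(pd) - math.sqrt(rho) * shock) / math.sqrt(1.0 - rho))


def shock_posterior(n_firms, n_defaults, pd, rho, lo=-5.0, hi=5.0, steps=1001):
    """Grid approximation of the posterior of the aggregate shock."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    log_post = []
    for z in grid:
        p = min(max(conditional_pd(pd, rho, z), 1e-300), 1.0 - 1e-12)
        # Binomial log-likelihood (constant term dropped) plus N(0, 1) log-prior.
        log_lik = n_defaults * math.log(p) + (n_firms - n_defaults) * math.log(1.0 - p)
        log_post.append(log_lik - 0.5 * z * z)
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]   # subtract peak for stability
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_median(grid, probs):
    """Smallest grid point at which the cumulative posterior mass reaches 50%."""
    cumulative = 0.0
    for z, p in zip(grid, probs):
        cumulative += p
        if cumulative >= 0.5:
            return z
    return grid[-1]


# Hypothetical year: 1,000 firms with a 2% PD and correlation 0.2.
grid_bad, post_bad = shock_posterior(1000, 60, pd=0.02, rho=0.2)    # many defaults
grid_good, post_good = shock_posterior(1000, 5, pd=0.02, rho=0.2)   # few defaults
```

Under these assumptions, a year with far more defaults than predicted yields a posterior concentrated on negative shocks, and a benign year yields one concentrated on positive shocks.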
3.4 Level Validation with CDS Data
This test analyzes the level bias in European EDF credit measures relative to that of U.S. EDF credit measures. The
rationale for the test is the assumption that similar risks should carry similar premiums in the U.S. and Europe.
We compare the median as well as the 25th and 75th percentile CDS levels of the two regions, U.S. and Europe, across
EDF-implied rating groups. The same EDF categories should have the same aggregate median spreads in the CDS market across
the two regions. We used Mark-It composite CDS data from January 2003 to December 2006. The Europe region is based
on the following currency information: Euro, Austrian Schilling, Belgian Franc, Swiss Franc, Czech Republic Koruna,
Deutsche Mark, Danish Kroner, Spanish Peseta, Finnish Markka, French Franc, Greek Drachmae, Hungarian Forint,
and British Pound. The U.S. region is based on the U.S. dollar.
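The comparison of spread quartiles by EDF-implied rating group can be sketched as below. The record layout (region, EDF-implied rating bucket, CDS spread in basis points), the bucket label, and all spread values are hypothetical, and the "inclusive" quantile method is just one reasonable convention.

```python
from collections import defaultdict
from statistics import quantiles


def spread_quartiles(records):
    """25th percentile, median, and 75th percentile of CDS spreads per group.

    records: iterable of (region, edf_rating_bucket, cds_spread_bps).
    Each (bucket, region) group needs at least two observations.
    """
    groups = defaultdict(list)
    for region, bucket, spread in records:
        groups[(bucket, region)].append(spread)
    summary = {}
    for key, spreads in groups.items():
        q1, q2, q3 = quantiles(spreads, n=4, method="inclusive")
        summary[key] = {"p25": q1, "median": q2, "p75": q3}
    return summary


# Made-up spreads for one EDF-implied rating bucket in each region.
records = (
    [("US", "Baa", s) for s in (10, 20, 30, 40, 50)]
    + [("Europe", "Baa", s) for s in (12, 22, 32, 42, 52)]
)
summary = spread_quartiles(records)
gap = summary[("Baa", "Europe")]["median"] - summary[("Baa", "US")]["median"]
```

A median gap close to zero within each bucket would be consistent with the EDF levels being comparable across the two regions.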
3.5 Median EDF by Rating Category across Regions
We calculate and compare median EDF credit measures for North American non-financial companies, Asian-Pacific
non-financial companies, European non-financial companies and global financial companies by several rating categories.
In the absence of other measures of credit risk, e.g., spreads or defaults, a comparison with ratings provides a sanity check
on the rank ordering of risk produced by the EDF credit measure and on the comparability of the level of the EDF credit
measure across geographies.
EMPIRICAL RESULTS
In this section, we describe empirical results.
4.1 North America
In this section, we describe empirical results obtained in North America. Results are separated into U.S. and North
American companies that are headquartered outside of the U.S. These companies are predominantly headquartered in
Canada, Bermuda and the Cayman Islands.
4.1.1 Data
We start with all U.S. firms that have publicly traded equity from 1996-2006, unless otherwise specified. We restrict the
sample to non-financial firms with more than $30 million in size.[7] For level validation we impose a further restriction of
$300 million in size.
We also present results for comparable North American firms that are outside of the U.S. (Canada, Bermuda, Cayman
Islands, Bahamas, Belize, Panama, Virgin Islands, and Netherlands Antilles). Table 1 shows the countries and the
number of firm-months in each country that constitute the North American module in Credit Monitor and CreditEdge.
Outside of the U.S., the largest countries are Canada, Bermuda and the Cayman Islands.
TABLE 1 Countries in the North American Database

Country                 Number of Observations (firm-month)
Netherlands Antilles    776
Bahamas                 440
Belize                  85
Bermuda                 3,552
Canada                  153,971
Cayman Islands          975
Panama                  245
USA                     1,127,452
Virgin Islands          491
For all comparisons against ratings, we used Moody's ratings.

Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges, and
insolvency proceedings. The defaults have been collected on a daily basis for more than ten years using a variety of
printed and on-line sources.[8] By the end of 2006, we had about 7,900 public defaults worldwide. About 5,600 defaults
were from North America.
[7] Size is measured by the sales of the firm for non-financial firms. Wherever the firm's total sales number was not available, we used
the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a
common denomination using a deflation adjustor calculated internally at Moody's KMV.
[8] To collect defaults, we use numerous printed and online sources from around the world on a daily basis. We use government filings,
government agency sources, company announcements, news services, specialized default news sources and even sources within
financial institutions to ensure to the greatest extent possible that we find all defaults. We also keep evidence in electronic format so
that content can be easily verified. As a result, Moody's KMV has the most extensive default database for public firms.
4.1.2 Timely Default Prediction - U.S.

In this section, we compare the performance of EDF credit measures against agency ratings in their ability to provide
timely warning of default. Figure 3 demonstrates how the median EDF credit measure (represented by the solid black line) starts
rising 24 months before the actual default, while the median Moody's rating stays flat until 13 months before default,
and then shows a steep rise about 5 months before default. In that sense, the EDF credit measures seem to lead the
ratings. This is helped further by the fact that the EDF credit measure is more continuous, and therefore one can see
a steady and continuous rise in the aggregate. Ratings, on the other hand, are discrete, and therefore one sees a step-like
function with flat stretches, implying that this measure does not instantaneously pick up the most currently available
information.
To test for the robustness of the results, we further divided our data into the subperiods:
1996-2000
2001-2006
The period 1996-2000 is shown on the left panel of Figure 4, and the period 2001-2006 is shown on the right panel of
Figure 4. Both EDF credit measures and ratings start at a higher level 24 months prior to default in the latter half of the
sample. EDF credit measures continued to lead the agency rating in each subperiod, indicating that EDF credit measures
indeed provide a more timely warning of impending defaults.
FIGURE 3 Comparison of median agency ratings with Moody's KMV EDF values for rated defaulted firms
in the U.S. from 2 years before default to 1 year after default between 1996 and 2006.
(Figure annotation: EDF measure is leading rating by 11 months.)
FIGURE 4 Comparison of median agency ratings with Moody's KMV EDF values for rated defaulted firms
in the U.S. from 2 years before default to 1 year after default for subsamples: 1996-2000 (left panel)
and 2001-2006 (right panel)
4.1.3 Default Predictive Power - U.S.

In this section, we compare the performance of EDF credit measures against agency ratings, Z-Scores, and a simple
Merton model in their ability to discriminate between good and bad firms. Our test statistic is the Accuracy Ratio as
defined earlier. We also show the plots of Cumulative Accuracy Profiles of these measures for various subsamples selected
using different horizons and size filters.
EDF Credit Measure vs. Agency Rating
Figure 5 shows the performance of EDF credit measures against ratings on the entire sample period of 1996-2006. By
design, this test is restricted to the sample of rated firms only. It is clear that the EDF credit measure performs better
than ratings on the entire sample period, with Accuracy Ratios of 0.88 and 0.75, respectively.
To ensure that the measure is robust in its performance across various time horizons, we divide our sample into two
subsets of data based on time periods:
1996-2000
2001-2006
We provide the analysis by three different size categories:
Size is greater than $30 million
Size is between $30 and $300 million
Size is greater than $300 million
FIGURE 5 Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit
measures and agency ratings for U.S. non-financial companies between 1996 and 2006. The Accuracy
Ratios for EDF measure and agency rating are 0.88 and 0.75, respectively.
Table 2 illustrates the results for the subsamples. We find that the EDF credit measure substantially outperforms ratings
in all categories, by at least 0.12 in Accuracy Ratio (12 percentage points).
TABLE 2 Accuracy Ratios by category for EDF Credit Measures and agency ratings for U.S. non-financial companies

Date                                EDF Credit Measure    Ratings
1996-2006                           0.88                  0.75
1996-2000                           0.87                  0.75
2001-2006                           0.88                  0.75
1996-2006, Size > $30 Million       0.88                  0.75
1996-2006, Size $30-$300 Million    0.75                  0.57
1996-2006, Size > $300 Million      0.89                  0.76
We also calculated Accuracy Ratios at the horizons longer than one year. The results are presented in Table 3. EDF
credit measures have more discriminatory power than agency ratings at all horizons, but the difference is smaller at
longer horizons.
TABLE 3 Accuracy Ratios of one- to five-year EDF credit measures and agency ratings
for U.S. non-financial companies between 1991 and 2006

Horizon                          EDF Credit Measure    Ratings    Number of Observations    Number of Defaults
One-year EDF credit measure      0.88                  0.76       2031                      354
Two-year EDF credit measure      0.81                  0.73       1926                      374
Three-year EDF credit measure    0.77                  0.71       1917                      385
Four-year EDF credit measure     0.72                  0.70       1892                      400
Five-year EDF credit measure     0.69                  0.68       1850                      404
The Accuracy Ratios (AR) for both the EDF credit measure and agency rating decrease with horizon. The difference between ARs
becomes more compressed at longer horizons.
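The compression of the gap can be read directly off Table 3. The short Python sketch below reproduces that arithmetic using only the Accuracy Ratios reported in the table; the variable names are illustrative.

```python
# Accuracy Ratios from Table 3: horizon in years -> (EDF credit measure, ratings).
table3 = {
    1: (0.88, 0.76),
    2: (0.81, 0.73),
    3: (0.77, 0.71),
    4: (0.72, 0.70),
    5: (0.69, 0.68),
}

# Gap in Accuracy Ratio at each horizon, rounded to the table's two decimals.
gaps = {horizon: round(edf - rating, 2) for horizon, (edf, rating) in table3.items()}

# The gap never widens as the horizon lengthens.
narrowing = all(gaps[h] >= gaps[h + 1] for h in range(1, 5))
```

The gap falls from 0.12 at the one-year horizon to 0.01 at the five-year horizon.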
Figure 6 and Figure 7 present the Accuracy Ratios for the EDF credit measure and agency rating by year at one- and
five-year horizons respectively.[9] For each year, we used the EDF credit measure as of the last market day of the prior year
to predict default during the next one or five years.
At a one-year horizon, the EDF credit measure has better discriminatory power than agency rating in all years except
1996, which had the fewest defaults. At a five-year horizon, the EDF credit measure also outperforms agency
rating in all years except 2000.
[9] The numbers underlying Figures 6 and 7 are summarized in Tables 15 and 16 of Appendix B.
FIGURE 6 Accuracy Ratios for EDF credit measures and agency ratings for U.S. non-financial companies
by year at the one-year horizon
[Chart: vertical axis, Accuracy Ratio from 0.50 to 1.00; horizontal axis, years 1996 to 2006; series: 1-Year EDF Credit Measure and Agency Rating.]
1.00
0.90
0.80
0.70
0.60
0.50
1991 1992
1993 1994 1995
1996 1997
1998 1999 2000
5-Year EDF Credit Measure
2001 2002
Agency Rating
FIGURE 7 Accuracy Ratios for EDF Credit Measures and agency ratings for U.S. non-financial
companies by year at the five-year horizon
EDF Credit Measure vs. Merton Default Probability and Z-Score
In this section we compare the performance of EDF credit measures to the Merton model's implied default probabilities
and Z-Scores, as described in Section 2. The sample period is 1996-2006. Unlike rated firms,
which are usually larger and higher profile, some unrated firms can be very small, and their defaults can go
unnoticed. In some cases, informal negotiations or bailouts avoid the default. These cases are likely
to contaminate our results. Therefore we filtered out very small firms (size < 30 million dollars) [10] from our sample. For
the entire period 1996-2006, the results are shown in Figure 8. The results are computed on a joint sample of Z-Scores,
Merton default probabilities, and EDF credit measures, which requires each of these values to be non-missing.
We find that the EDF credit measure substantially outperforms the Merton default probability and Z-Score in its
ability to discriminate good firms from bad firms, with Accuracy Ratios of 0.82, 0.72, and 0.66, respectively. We
further divide the sample into subsets of sizes 30 million dollars to 300 million dollars, and 300 million dollars and
above. In both cases, the EDF credit measure outperforms the Merton model and Z-Score, as shown in Table 4.
As a robustness check, we compared the performance of the three measures across the time periods
1996-2000 and 2001-2006. The results are shown in Table 4. The results are fairly robust, with EDF credit
measures outperforming Merton default probabilities and Z-Scores in both periods.
FIGURE 8 Cumulative Accuracy Performance (CAP) curves comparing Moody's KMV EDF credit measures,
Merton default probability and Z-Scores for U.S. non-financial companies between 1996 and 2006. The
Accuracy Ratios for EDF measure, Merton Default Probability and Z-Score are 0.82, 0.72 and 0.66
respectively.
10
Size is measured by the sales of the firm for non-financial firms. Whenever the firm's total sales number was not available, we used
the book asset value of the firm.
TABLE 4 Summary of Accuracy Ratios across various size buckets and time horizons for EDF
credit measure, Merton default probability, and Z-Score for U.S. non-financial companies
Date/Size                           EDF Credit Measure   Z-Score   Merton Default Probability
1996-2006, Size > $30 Million       0.82                 0.66      0.72
1996-2000, Size > $30 Million       0.82                 0.66      0.73
2001-2006, Size > $30 Million       0.82                 0.67      0.71
1996-2006, Size $30-$300 Million    0.76                 0.65      0.67
1996-2006, Size > $300 Million      0.88                 0.66      0.77
EDF Credit Measure vs. RiskCalc EDF Credit Measure

In this section we compare the performance of EDF credit measures to RiskCalc EDF credit measures calculated for
public firms, as described in Section 2. The sample period is 1996-2006. As before, we filtered out very small
firms (size < 30 million dollars) [11] from our sample. For the entire period 1996-2006, the results are shown in Figure 9.
We find that EDF credit measures have more discriminatory power than RiskCalc EDF credit measures, which we
expected because RiskCalc does not incorporate firm-specific equity market information. Their Accuracy Ratios are
0.82 and 0.68, respectively. We further divided the sample into subsets of sizes of 30 million dollars to 300 million
dollars, and 300 million dollars and above. In both cases, EDF credit measures outperform RiskCalc EDF credit
measures, as shown in Table 5. Both measures perform better for larger firms.
As a robustness check, we compared the performance of the two measures across the time periods
1996-2000 and 2001-2006. The results are presented in Table 5. The results are fairly robust, with the
EDF credit measures outperforming the RiskCalc EDF credit measures in both periods. The Accuracy Ratio of the
EDF credit measure is higher in the second period, while the Accuracy Ratio of the RiskCalc EDF stays the same.
11
Size is measured by the sales of the firm for non-financial firms. Wherever the firm's total sales number was not available, we used
the book asset value of the firm.
FIGURE 9 Cumulative Accuracy Performance (CAP) curves comparing Moody's KMV EDF credit
measures and RiskCalc EDF credit measures between 1996 and 2006 for U.S. non-financial
companies. The Accuracy Ratios for EDF measure and RiskCalc EDF measure are 0.82 and 0.68
respectively.
TABLE 5 Summary of Accuracy Ratios for EDF Credit Measures and RiskCalc EDF Credit
Measures for U.S. non-financial companies by different size buckets and time periods
Date / Size                         EDF Credit Measure   RiskCalc EDF Credit Measure
1996-2006, Size > $30 Million       0.82                 0.68
1996-2000, Size > $30 Million       0.81                 0.68
2001-2006, Size > $30 Million       0.83                 0.68
1996-2006, Size $30-$300 Million    0.76                 0.64
1996-2006, Size > $300 Million      0.89                 0.72
The EDF credit measure effectively discriminates between good and bad credits. It performed better than the Z-Score,
the RiskCalc model for private firms applied to public firms, and a simple implementation of the Merton model. It leads rating changes in
predicting defaults, and it performs well across multiple cuts of the data and multiple horizons.
4.1.4 Accuracy of Levels U.S.

The test for this criterion draws on the methodology used by Kurbat and Korablev (2002) and Bohn, Arora and
Korablev (2005), which is described in Section 3. We also extended this methodology by using Bayesian methods to
compute the posterior distribution of the aggregate shock given the realized default rate, the model, and the correlation
coefficient, as described in Dwyer (2007).
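To make the Bayesian step more concrete, the following Python sketch computes a posterior for the aggregate shock and the P-value of the realized default count on a grid. It is a simplified illustration under assumptions that are not taken from the paper: a single-factor Gaussian model, every firm sharing the same default probability, a standard normal prior on the shock, and a grid approximation. The function names and inputs are hypothetical, and the sketch does not reproduce the procedure of Dwyer (2007).

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_pd(edf, rho, shock):
    """Default probability given the aggregate shock (positive = good year)."""
    threshold = STD_NORMAL.inv_cdf(edf)
    return STD_NORMAL.cdf((threshold - sqrt(rho) * shock) / sqrt(1.0 - rho))


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k defaults among n firms."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_comb + k * log(p) + (n - k) * log(1.0 - p)


def make_grid(lo=-4.0, hi=4.0, steps=801):
    return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]


def shock_posterior(edf, rho, n_firms, n_defaults):
    """Grid posterior of the shock: prior N(0, 1) times binomial likelihood."""
    grid = make_grid()
    log_w = [
        log(STD_NORMAL.pdf(z))
        + log_binom_pmf(n_defaults, n_firms, conditional_pd(edf, rho, z))
        for z in grid
    ]
    top = max(log_w)
    weights = [exp(v - top) for v in log_w]
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_quantile(grid, weights, q):
    cumulative = 0.0
    for z, w in zip(grid, weights):
        cumulative += w
        if cumulative >= q:
            return z
    return grid[-1]


def p_value(edf, rho, n_firms, n_defaults):
    """Prior probability of observing n_defaults or fewer."""
    grid = make_grid()
    prior = [STD_NORMAL.pdf(z) for z in grid]
    norm = sum(prior)
    total = 0.0
    for z, pw in zip(grid, prior):
        p = conditional_pd(edf, rho, z)
        cdf = sum(exp(log_binom_pmf(j, n_firms, p)) for j in range(n_defaults + 1))
        total += (pw / norm) * cdf
    return total


# Hypothetical inputs: 1635 firms with a common 3% EDF, 16 realized defaults.
grid, weights = shock_posterior(0.03, 0.19, 1635, 16)
q10 = posterior_quantile(grid, weights, 0.10)
q50 = posterior_quantile(grid, weights, 0.50)
q90 = posterior_quantile(grid, weights, 0.90)
pv = p_value(0.03, 0.19, 1635, 16)
```

With fewer defaults than the model's median prediction, the posterior median of the shock is positive and the P-value falls below 50%. Because the sketch ignores the dispersion of EDF values across firms, its numbers should not be expected to match the reported tables.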
The other credit risk measures cannot be directly interpreted as physical default probabilities, nor do they
provide a framework that can account for the underlying correlations between assets. Therefore they cannot be compared
against EDF credit measures for the level test [12]. In addition, hidden or missing defaults are an issue for
smaller firms, as explained in Kurbat and Korablev (2002). Therefore, consistent with that paper, we restrict this test to
firms of size 300 million dollars and above.
We first present results broken down by coarser levels of the EDF credit measure, then repeat the analysis for narrower
ranges of the measure.
Results for Firms with EDF Values Below 35%
In the previous validation studies (Kurbat and Korablev (2002), Bohn, Arora and Korablev (2004)), the test was
performed on the EDF 7.1 model, which was capped at 20%. In that case the predicted number of defaults was likely to
underestimate the realized number of defaults due to the truncation effect. Therefore we divided our sample into two:
EDF credit measures less than 20% and EDF credit measures equal to 20%.
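The truncation effect can be illustrated with a few lines of Python. The default probabilities below are hypothetical and chosen only to show the direction of the effect: when the reported measure is capped, the expected number of defaults implied by the capped values falls short of the expected number implied by the uncapped values, and a higher cap narrows the gap.

```python
def expected_defaults(default_probabilities, cap):
    """Expected number of defaults implied by probabilities truncated at cap."""
    return sum(min(p, cap) for p in default_probabilities)


# Hypothetical uncapped default probabilities for four firms.
uncapped_pds = [0.05, 0.15, 0.30, 0.50]

expected_uncapped = sum(uncapped_pds)                    # no truncation
expected_cap_20 = expected_defaults(uncapped_pds, 0.20)  # 20% cap
expected_cap_35 = expected_defaults(uncapped_pds, 0.35)  # 35% cap
```

In this toy portfolio the 20% cap removes more expected defaults than the 35% cap, which is the sense in which a higher cap can be expected to lessen the truncation effect.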
One of the main features of the EDF 8.0 model is the new cap of 35%. We can now expect the truncation effect
to lessen or even disappear. Nevertheless, to be consistent with the previous studies, we split the sample
into two: firms with EDF values less than 35% (3500 bps) and firms with EDF values equal to 35%. The comparison
for the sample of firms with EDF values less than 35% is shown in Figure 10. The left panel of Figure 10 displays the mean and
median predicted (by simulation) default rates and the actual default rate for EDF values below 35%, along with the 80% confidence set for
the predicted default rate. We used an asset correlation of 0.19 to simulate defaults in each year. The right panel of
Figure 10 presents the posterior distribution for the aggregate shock given the actual default rate, and the P-values of the
actual default rate, that is, the probability of observing a default rate at or below the actual default rate.
12
The exception to this is the Merton model, but the default probabilities implied by the Merton model are too low, and therefore it
would usually underestimate the predicted number of defaults.
The predicted default rate tracks the realized default rate very well, and the realized default rate falls within the
confidence set in every year. The largest deviation is in 2003, which was an uncharacteristically good year for the economy, leading to a
substantially lower number of defaults than predicted. To explain the low default rate in 2003, we estimate that the U.S. economy
received a positive 0.84 standard deviation shock relative to market expectations. Such a positive shock is consistent with
the high returns on the S&P 500 observed during that year. The P-values of the realized default rate range from 21% to
75%, which is within the sampling variability that would be expected.
FIGURE 10
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure less than 35%. We used an asset
correlation of 0.19 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for realized
default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line
is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50, is the
median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles; the blue
line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
We summarize the numbers that underlie Figure 10 in Tables 6 and 7. Table 6 contains the number of firms, the number of
defaults, and the median and mean predicted default rate per year, as well as the 10th and 90th percentiles for the predicted default
rate. It is clear from this table that the correlation effect concentrates the distribution of default rates at low values, with a
long tail of high default rates: the median predicted default rate lies below the mean in every year. If we ignored
this effect and had simply taken the mean default rate of the sample, we would have grossly over-predicted the realized
default rate. Table 7 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the
P-value by year.
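The gap between the mean and the median predicted default rate can be reproduced qualitatively with a small Python sketch. It assumes a single-factor Gaussian model and a large portfolio in which every firm has the same default probability, so that the default rate is a deterministic function of the aggregate shock. These assumptions, the function names, and the 2% input are illustrative and are not taken from the paper, which simulates defaults firm by firm.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_default_rate(edf, rho, shock):
    """Portfolio default rate given the aggregate shock (positive = good year)."""
    threshold = STD_NORMAL.inv_cdf(edf)
    return STD_NORMAL.cdf((threshold - sqrt(rho) * shock) / sqrt(1.0 - rho))


def default_rate_quantile(edf, rho, q):
    """q-th quantile of the default rate distribution.

    The default rate decreases in the shock, so the q-th quantile of the
    rate corresponds to the (1 - q)-th quantile of the shock.
    """
    return conditional_default_rate(edf, rho, STD_NORMAL.inv_cdf(1.0 - q))


def mean_default_rate(edf, rho, lo=-8.0, hi=8.0, steps=4001):
    """Mean default rate, integrating numerically over the shock."""
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        z = lo + i * h
        total += conditional_default_rate(edf, rho, z) * STD_NORMAL.pdf(z) * h
    return total


# Hypothetical inputs: common EDF of 2%, asset correlation of 0.19.
edf, rho = 0.02, 0.19
mean_rate = mean_default_rate(edf, rho)
median_rate = default_rate_quantile(edf, rho, 0.50)
p10_rate = default_rate_quantile(edf, rho, 0.10)
p90_rate = default_rate_quantile(edf, rho, 0.90)
```

Under these assumptions the mean default rate stays at the input EDF while the median is noticeably lower, which is the pattern the mean and median columns of the table display.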
TABLE 6 Comparison of mean and median predicted default rate with the realized
default rate between 1991 and 2006
Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   percentile   percentile
1991   2.3%             1.7%               2.5%           0.5%         4.9%         1554    39
1992   1.4%             1.0%               1.0%           0.2%         3.2%         1549    15
1993   1.3%             0.9%               0.9%           0.2%         2.9%         1639    15
1994   1.1%             0.7%               0.6%           0.1%         2.5%         1775    10
1995   1.2%             0.7%               0.9%           0.1%         2.6%         1847    16
1996   1.2%             0.8%               0.9%           0.2%         2.8%         1906    17
1997   1.2%             0.8%               0.8%           0.2%         2.6%         2054    17
1998   1.1%             0.7%               0.9%           0.1%         2.4%         2114    20
1999   1.8%             1.2%               1.0%           0.3%         3.9%         2106    22
2000   2.6%             1.9%               1.9%           0.5%         5.5%         2042    38
2001   3.6%             2.8%               2.7%           0.8%         7.3%         1783    48
2002   2.5%             1.9%               1.8%           0.5%         5.4%         1707    31
2003   3.0%             2.3%               1.0%           0.6%         6.2%         1635    16
2004   1.2%             0.8%               0.7%           0.2%         2.8%         1699    12
2005   0.8%             0.5%               1.0%           0.1%         1.9%         1806    18
2006   0.7%             0.4%               0.2%           0.1%         1.5%         1835
The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures less than 35%.
TABLE 7 Summary table of aggregate shock and year-wise probability of realizing the
actual number of defaults between 1991 and 2006
Year   10th Percentile   Median Aggregate Shock   90th Percentile   Probability of having actual defaults or even lower
1991   -0.64             -0.42                    -0.19             68.7%
1992   -0.28             0.03                     0.34              51.7%
1993   -0.37             -0.07                    0.23              57.9%
1994   -0.16             0.19                     0.52              47.3%
1995   -0.42             -0.13                    0.15              58.9%
1996   -0.35             -0.06                    0.23              54.7%
1997   -0.35             -0.06                    0.22              57.7%
1998   -0.52             -0.25                    0.01              64.2%
1999   -0.11             0.15                     0.40              47.7%
2000   -0.20             0.02                     0.23              51.3%
2001   -0.17             0.04                     0.24              49.5%
2002   -0.21             0.03                     0.26              52.0%
2003   0.54              0.84                     1.13              21.3%
2004   -0.20             0.13                     0.45              51.4%
2005   -0.86             -0.58                    -0.31             74.5%
2006   -0.02             0.46                     0.92              45.1%
The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures less than 35%.
Results for Firms with EDF Values Equal to 35%

Figure 11 shows the median predicted and actual number of defaults for EDF credit measures of 35%. We used 0.181 as
the asset correlation for pairs of firms in each year to simulate defaults. The companies in this sample are, on average,
somewhat less correlated with each other than the set of firms with EDF credit measures of less than 35%. We find that
the realized default rate ranges from 10% to 67%. The high default rate in 1998 is indicative of a large negative shock,
which is shown in Figure 11 along with the P-values of the realized default rate. The P-values range from 8% to 93%,
which is within the sampling variability that would be expected over a 15-year period.
FIGURE 11
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure equal to 35%. We used an asset
correlation of 0.181 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for
realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50
is the median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the
blue line is the P-value of the actual default rate, which is the probability of observing a default at or lower than the actual default rate.
We summarize the numbers that underlie Figure 11 in Tables 8 and 9. Table 8 contains the number of firms, the
number of defaults, the median and mean predicted default rate per year, as well as the 10th and 90th percentiles for the
predicted default rate. As before, the correlation effect pulls the median predicted default rate below the
mean. Table 9 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the P-value by
year.
TABLE 8 Comparison of mean and median predicted default rate with the realized
default rate between 1991 and 2006
Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   Percentile   Percentile
1991   35.0%            33.4%              40.0%          12.7%        59.7%        30      12
1992   35.0%            33.4%              24.0%          12.3%        60.3%        25
1993   35.0%            33.5%              33.3%          11.8%        60.9%        21
1994   35.0%            33.5%              11.8%          11.2%        61.9%        17
1995   35.0%            33.7%              15.4%          10.4%        63.5%        13
1996   35.0%            33.6%              12.5%          11.0%        62.2%        16
1997   35.0%            34.2%              11.1%          9.2%         67.1%
1998   35.0%            33.6%              66.7%          10.8%        62.6%        15      10
1999   35.0%            33.4%              35.1%          13.1%        59.2%        37      13
2000   35.0%            33.5%              38.3%          13.6%        58.8%        47      18
2001   35.0%            33.5%              40.2%          14.5%        57.8%        107     43
2002   35.0%            33.5%              41.0%          13.9%        58.4%        61      25
2003   35.0%            33.5%              38.0%          14.1%        58.2%        71      27
2004   35.0%            33.4%              23.1%          12.4%        60.1%        26
2005   35.0%            33.5%              10.0%          11.7%        61.1%        20
2006   35.0%            33.5%              16.7%          11.4%        61.6%        18
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure equal to 35%.
TABLE 9 Summary table of aggregate shock and year-wise probability of realizing the actual
number of defaults between 1991 and 2006
Year   10th Percentile   Median Aggregate Shock   90th Percentile   The probability of having actual defaults or even lower
1991   -0.86             -0.30                    0.24              63.1%
1992   -0.21             0.44                     1.06              30.6%
1993   -0.65             0.00                     0.63              50.0%
1994   0.13              0.94                     1.76              10.8%
1995   -0.14             0.69                     1.54              17.0%
1996   0.07              0.88                     1.71              12.0%
1997   -0.21             0.72                     1.68              12.4%
1998   -1.96             -1.23                    -0.53             92.8%
1999   -0.60             -0.08                    0.42              53.8%
2000   -0.71             -0.24                    0.22              60.2%
2001   -0.68             -0.35                    -0.04             64.4%
2002   -0.79             -0.38                    0.03              65.7%
2003   -0.63             -0.24                    0.15              60.0%
2004   -0.15             0.49                     1.11              28.7%
2005   0.30              1.08                     1.89              8.0%
2006   -0.03             0.73                     1.48              17.9%
The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures equal to 35%.
Results by EDF Subgroups

To test the robustness of our results, we further divide the sample of firms with EDF values less than 35% into smaller
groups. The EDF buckets we used, along with the correlation used for default simulation in each bucket, are presented in
Table 10.
TABLE 10 EDF buckets
Stratum   EDF Range   Correlation   Number of Firms
1         0.01-5      0.191         22887
2         5-12        0.177         1264
3         12-35       0.192         442
Figures 12, 13, and 14 show the median and mean predicted default rates, the prediction interval for the realized default rate, and the actual default
rate for EDF values in the ranges [0.01, 5), [5, 12), and [12, 35), respectively. It is clear from these figures that while the
predicted and realized default rates can deviate from each other in certain years, there is no substantial bias in their levels
over the long run. In general, the two levels track each other very well. All realized default rates fall within the
prediction interval. Year 2003 was an uncharacteristically good year for the economy, leading to a substantially lower
number of defaults in two of the three subgroups.
FIGURE 12
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 0.01% and 5%. We used an
asset correlation of 0.191 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for
realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
FIGURE 13
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 5% and 12%. We used an
asset correlation of 0.177 to simulate defaults in each year. On the left panel, the gray lines represent 80% prediction interval for
predicted default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the posterior distribution of the aggregate shock and P-values. The dark
black line, rm50, is the median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th
percentiles; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the
actual default rate.
FIGURE 14
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 12% and 34.99%. We used
an asset correlation of 0.192 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval
for realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
4.1.5 Timely Default Prediction Outside the U.S.

The timeliness test outside the U.S. produces results very similar to those in the U.S. The median EDF credit measure
starts rising 24 months before the actual default, while the median rating moves from B2 to B3 18 months before default,
then stays flat until six months before default, at which point it rises sharply. EDF credit measures clearly lead ratings.
FIGURE 15
Comparison of median agency ratings with Moody's KMV EDF values for defaulted firms
from two years before default to one year after default for North American companies outside the
U.S. and sample period between 1996 and 2006
4.1.6 Default Predictive Power Outside the U.S.

In this section, we compare the performance of Moody's KMV EDF credit measures, Z-Scores, and a simple Merton
model in their ability to discriminate between good and bad firms. We do not perform a power test against agency ratings
because of the small number of rated defaults outside the U.S.
EDF Credit Measure vs. Merton Default Probability and Z-Score
In this section we compare the performance of EDF credit measures to the Merton model and Z-Scores, as described in
Section 2. The sample period is 1996-2006. We filtered out very small firms (size < 30 million dollars) from our
sample, as we did in the case of U.S. companies. Results for the entire period 1996-2006 are shown in Figure 16. The
results are computed on a joint sample of Z-Scores, Merton default probabilities, and EDF credit measures; all three values
must be non-missing for an observation to be included in the sample.
We find that the EDF credit measure outperforms the Merton default probability and Z-Score in discriminating
good firms from bad firms, with Accuracy Ratios of 0.78, 0.70, and 0.65, respectively. Because of the
sample size, we do not divide the sample into two subsamples as we did in the U.S.
FIGURE 16
Cumulative Accuracy Performance (CAP) curves comparing EDF credit measures,
Merton Default Probability and Z-Scores between 1996 and 2006 for North American companies
outside the U.S. The Accuracy Ratios for the EDF credit measure, Merton Default Probability and
Z-Score are 0.78, 0.70 and 0.65 respectively.
4.1.7 Accuracy of Levels Outside the U.S.

Figure 17 presents the level validation results for the sample of firms with EDF credit measures below 35%. The left
panel of Figure 17 displays the mean and median predicted default rates and the actual default rate, as well as the 80% confidence set for
predicted defaults. We used an asset correlation of 0.19 to simulate defaults in each year. The right panel of Figure 17
displays the posterior distribution for the aggregate shock given the actual default rate, and the P-values of the actual default
rate, that is, the probability of observing a default rate at or below the actual default rate.
The predicted default rate tracks the realized default rate very well. The realized default rate fluctuates around the median predicted
default rate. In all years except 1991, realized default rates fall within the confidence set. Year 1991 was a good year,
leading to a lower number of defaults. To explain the low default rate in 1991, we estimate that the U.S. economy
received a positive 0.82 standard deviation shock relative to market expectations. P-values are between 8% and 75%,
which is in the range we expect over a 15-year period.
FIGURE 17
The sample was restricted to North American firms outside the U.S. larger than 300 million dollars and EDF credit measure less than
35%. We used an asset correlation of 0.19 to simulate defaults in each year. On the left panel the gray lines represent 80% prediction
interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate
and red dotted line is the realized default rate. The right panel shows the posterior distribution of the aggregate shock and P-values.
The dark black line, rm50, is the median of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles
of the aggregate shock; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or below
the actual default rate.
4.1.8 Conclusion
Results obtained for the North American sample show that the EDF credit measure leads the agency rating in timely
default prediction. The EDF credit measure leads other alternative measures in its ability to discriminate good firms from
bad firms over time and across various subsections of the data. We also showed that the model-predicted default rates
track realized default rates well, and that the model works well not only in the U.S., but also in North America excluding the
U.S.
4.2 Europe
In this section, we describe the results obtained in Europe.
4.2.1 Diversity in Bankruptcy Mechanisms and Creditor Protection

Bankruptcy mechanisms can differ between regions. For example, Davydenko and Franks (2005) found that while the
British bankruptcy mechanism is designed to be extremely creditor friendly, the French system is geared toward
protecting a business as a going concern even at the expense of its creditors [14]. While interpreting the validation results,
it is important to understand the impact of these mechanisms on the outcome of the model. For example, if a system is
too creditor-friendly, the creditors can pressure the firm at the slightest hint of distress. This action may cause a firm to
file for bankruptcy sooner, although the recovery for creditors may be higher. On the other hand, if the system is too
geared toward protecting a firm, the creditors may not be allowed to take a firm to court even if it is in severe distress.
14
A brief description of the similarities and differences among British, French, and German bankruptcy mechanisms is provided in
Korablev (2005).
A second characteristic is the nature of debt in an economy. A creditor-debtor relationship might be close (as in Japan),
or at arm's length (as in the U.S.). If the creditors are few and have a close relationship with the debtor, they are more
likely to evaluate the long-term potential of the debtor before taking it into bankruptcy. If the creditors are scattered,
there is a higher likelihood of a free-rider problem, leading to a forced bankruptcy even if the debtor may have some
long-term positive potential.
In general, we see an equal contribution of non-bankruptcy defaults [15] and bankruptcies in North America, while the
European cases of distress are dominated by bankruptcies, as shown in Figure 18. This may be influenced by two
factors. First, in many economies within Europe, the debt is held more closely relative to that in the U.S., making it
more likely to enter private renegotiations of debt and avoid default during times of a liquidity crunch. Second, many
cases of defaults may not be covered by the media, and are in that sense hidden. These two factors should not be
applicable to larger firms because their debt is usually widely held, and they are followed more closely by media.
Figure 19 compares the percentage representation of default cases in Europe and North America by size over the period
of 19962006. Defaults as a fraction of total distress cases are substantially smaller in Europe for small and mid-sized
firms. Larger firms, however, have more comparable default behavior across Europe and North America. This shows that
the model validation is more reliable on the sample of large companies because of the quality of data on actual defaults.
[Chart: two panels, North America and Europe; vertical axis 0% to 100%; horizontal axis 1996 to 2006. Series: Bankruptcy, Defaults]
FIGURE 18 Percentage representation of defaults and bankruptcies in North American
and European Markets between 1996 and 2006
[Chart: two panels, North America and Europe; vertical axis in percent; horizontal axis 1996 to 2006. Series: Size < 30 Million, 30 Million <= Size <= 300 Million, Size > 300 Million]
FIGURE 19 Default events as a percentage of all distress cases across three size buckets
between 1996 and 2006
15
The following events constitute non-bankruptcy defaults: missed interest or principal payment, distressed extension of a loan,
distressed exchange offer, delay in paying a substantial portion of trade debt, and government takeover of a financial institution to prevent
market collapse.
The success of a model relies on the ability of the inputs to take regional nuances into account. A model whose inputs are
not universal in concept may have more difficulty capturing the differences in characteristics of the system in which it is
being implemented. As long as the economic fundamentals of a model are universal in nature, it is not necessary to
interpret its output differently across different regions. For the Moodys KMV EDF model, one of the main drivers is
asset value, which is inferred from the equity value and an underlying structural framework. The model should work well
for data from individual regions and for data pooled across them because the equity markets take into account the
regional differences.
The extent to which different equity markets accurately reflect firm value and volatility has implications for the power
and the level performances of the model. In fact, even if a model is powerful in discriminating defaulters from
non-defaulters in different regions, but is off in its level performance, the aggregation of data across regions will make the
model seem less powerful. For example, if a distance-to-default (DD) of 2 corresponds to an EDF credit measure of 5%
in the U.K., but 2% in France, then an aggregation of data would incorrectly suggest that both a U.K. firm and a French
firm with a DD of 2 correspond to the same rank in our test. In that sense, a default predictive power test on a dataset
aggregated across different regions essentially tests a joint hypothesis that the model is powerful and that the DD-to-EDF
mapping is similar across different regions. It could be the case that the model might be powerful in two regions
separately, but may appear less powerful if the data are aggregated.
Similarly, while testing for levels, one could imagine that the model had specified levels in two regions incorrectly,
overestimating the default rate in one region and underestimating it in the other. However, it may work well on the
aggregated dataset. Therefore, a reasonable level performance on aggregated data is a necessary, but not a sufficient, test
for the level performance of the model in each region. Unfortunately, there is an insufficient number of defaults available
to perform a reliable level test in each subregion of Europe.
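The joint-hypothesis point about pooled power tests can be illustrated with a deliberately small, hypothetical example in Python. In each of two regions the distance-to-default (DD) ranks defaulters perfectly, but the DD level at which firms default differs between the regions, so pooling the data lowers the measured Accuracy Ratio. The data, the region labels, and the use of the identity AR = 2 x AUC - 1 are illustrative choices, not taken from the paper's calculations.

```python
def accuracy_ratio_from_dd(dd_values, defaulted):
    """Accuracy Ratio computed as 2 * AUC - 1, where a lower DD means riskier."""
    bad = [d for d, y in zip(dd_values, defaulted) if y == 1]
    good = [d for d, y in zip(dd_values, defaulted) if y == 0]
    # Count defaulter/survivor pairs that are ranked correctly (ties count half).
    concordant = 0.0
    for b in bad:
        for g in good:
            if b < g:
                concordant += 1.0
            elif b == g:
                concordant += 0.5
    auc = concordant / (len(bad) * len(good))
    return 2.0 * auc - 1.0


# Hypothetical region A: firms with DD below 2 default.
region_a_dd = [0.5, 1.5, 2.5, 3.5]
region_a_default = [1, 1, 0, 0]

# Hypothetical region B: only firms with DD below 1 default.
region_b_dd = [0.6, 1.4, 2.6, 3.4]
region_b_default = [1, 0, 0, 0]

ar_a = accuracy_ratio_from_dd(region_a_dd, region_a_default)
ar_b = accuracy_ratio_from_dd(region_b_dd, region_b_default)
ar_pooled = accuracy_ratio_from_dd(
    region_a_dd + region_b_dd, region_a_default + region_b_default
)
```

Each region on its own shows perfect discrimination, while the pooled sample does not, because a surviving region B firm with a DD of 1.4 is ranked as riskier than a defaulting region A firm with a DD of 1.5.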
4.2.2 Data
We start with all European firms that have publicly traded equity between 1996 and 2006. The sample was then
restricted to non-financial firms with more than $30 million in size [16] to avoid the missing and hidden default problem. For
level validation we imposed a further restriction of $300 million in size.
16
Following our practice in North America, size is measured by the sales of the firm for non-financial firms. Whenever the firm's total
sales number was not available, we used the book asset value of the firm. This number was further adjusted for inflation across
years by converting the numbers to a common denomination using the appropriate consumer price index and exchange rate.
TABLE 11 Number of companies by country in the European Module of
Credit Monitor and CreditEdge
Country           Code   Size >= $30 Million   Size >= $300 Million
Austria           AUT    112                   60
Belgium           BEL    134                   77
Switzerland       CHE    232                   153
Czech Republic    CZE    75                    30
Germany           DEU    796                   394
Denmark           DNK    171                   77
Spain             ESP    177                   124
Finland           FIN    153                   81
France            FRA    897                   413
Great Britain     GBR    1959                  788
Greece            GRC    256                   66
Hungary           HUN    37                    15
Ireland           IRL    76                    37
Iceland           ISL
Israel            ISR    136                   48
Italy             ITA    301                   172
Luxembourg        LUX    31                    23
Netherlands       NLD    238                   152
Norway            NOR    232                   90
Poland            POL    118                   43
Portugal          PRT    90                    37
Russia            RUS    61                    57
Slovakia          SVK    16
Slovenia          SVN
Sweden            SWE    311                   134
Turkey            TUR    165                   55
We also present the results for level validation for the subsample of countries that have more than 100 companies of size
$300 million or more. These countries are Switzerland, Germany, Spain, France, Great Britain, Italy, the Netherlands, and
Sweden. The number of firms by country and size is shown in Table 11.
Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges, and
insolvency proceedings [17]. For all comparisons against agency ratings we used Moody's ratings.
4.2.3 Timely Default Prediction

We assess the timeliness of default prediction following the methodology described in Section 3.1. We create a sample of defaulted firms, retaining monthly observations from 24 months before default until 10 months after default. We include only observations that have a non-missing history of EDF credit measures and ratings 24 months prior to default. We then compute the median EDF credit measure and the median rating by months to default and overlay the two series.
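The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `median_by_month`, the tuple layout, and the numeric rating encoding are hypothetical, and the observations are made up rather than taken from the study.

```python
from collections import defaultdict
from statistics import median

def median_by_month(observations):
    """Group (months_to_default, edf, rating_rank) rows and take medians.

    months_to_default runs from -24 (two years before default) to +10.
    rating_rank is a hypothetical numeric encoding of the agency rating
    (for example 1 = Aaa ... 21 = C); the encoding is an assumption.
    """
    edf_by_month = defaultdict(list)
    rating_by_month = defaultdict(list)
    for months_to_default, edf, rating_rank in observations:
        if -24 <= months_to_default <= 10:  # keep only the study window
            edf_by_month[months_to_default].append(edf)
            rating_by_month[months_to_default].append(rating_rank)
    return {
        m: (median(edf_by_month[m]), median(rating_by_month[m]))
        for m in sorted(edf_by_month)
    }

# Made-up observations for three defaulted firms (not data from the study).
sample = [
    (-24, 0.8, 12), (-24, 1.2, 13), (-24, 2.0, 14),
    (-12, 3.5, 13), (-12, 5.0, 14), (-12, 9.0, 15),
    (0, 20.0, 18), (0, 35.0, 19), (0, 35.0, 20),
    (11, 35.0, 21),  # outside the window, ignored
]
profile = median_by_month(sample)
```

Plotting the two medians in `profile` against months to default would give an overlay of the kind shown in the figure for this section.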
Figure 20 demonstrates that in the event of default, EDF credit measures become elevated 11 months before ratings.
Ratings move later and more abruptly, giving the most signal in the last nine months.
FIGURE 20
Median agency ratings and Moody's KMV EDF values for rated defaulted firms in Europe
from 24 months before default to 10 months after default between 1996 and 2006
4.2.4 Default Predictive Power

EDF credit measures outperform simple Merton model implied default probabilities and Z-Scores in their ability to discriminate between defaulters and non-defaulters, as Figure 21 shows. The Accuracy Ratios for the EDF credit measure, Merton default probability, and Z-Score are 0.79, 0.70, and 0.61, respectively.
17
To collect defaults, we use numerous printed and online sources from around the world on a daily basis. We use government filings, government agency sources, company announcements, news services, specialized default news sources, and even sources within financial institutions to ensure, to the greatest extent possible, that we find all defaults. We also keep evidence in electronic format so that content can be easily verified. As a result, Moody's KMV has the most extensive default database for public firms.
We divide the sample into two size subsets: $30 million to $300 million, and $300 million and above. In both cases the EDF credit measure outperforms the Merton model implied default probability and the Z-Score, as shown in Table 12. All three measures perform better for larger firms.
As a robustness check, we compared the performance of the three measures across two periods, 1996-2000 and 2001-2006. The results, presented in Table 12, show that the EDF credit measure outperforms the Merton model and the Z-Score in both periods. The EDF credit measure and the Merton default probability perform better in 1996-2000, while the Z-Score has a higher Accuracy Ratio in 2001-2006.
FIGURE 21
Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit measures, Z-Scores, and Merton default probabilities for European non-financial firms between 1996 and 2006. The Accuracy Ratios for the EDF credit measure, Z-Score, and Merton default probability are 0.79, 0.61, and 0.70, respectively.
We summarize the findings of this section in Table 12. The results clearly show that the EDF credit measure in Europe outperforms the other popular alternatives in its ability to discriminate good firms from bad firms at a 1-year horizon.
TABLE 12
Summary of Accuracy Ratios across various size buckets and time periods for European non-financial firms

Date                                        EDF Credit Measure   Z-Score   Simple Merton Model
1996-2006, Size > $30 Million               0.79                 0.61      0.70
1996-2000, Size > $30 Million               0.79                 0.53      0.71
2001-2006, Size > $30 Million               0.78                 0.64      0.64
1996-2006, Size between $30-$300 Million    0.75                 0.60      0.64
1996-2006, Size > $300 Million              0.83                 0.65      0.77
4.2.5 Level Validation with Default Data

To validate the accuracy of levels we followed the methodology described in Section 3.3.
Results for the Whole Sample
Figure 22 shows the level validation results for the sample of European firms with size greater than $300 million. The left panel of Figure 22 displays the mean predicted, median predicted, and actual default rates, along with the 80% prediction interval for the default rate. We used an asset correlation of 0.25 to simulate defaults in each year. The right panel of Figure 22 presents the posterior distribution of the aggregate shock given the actual default rate, together with the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
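As a rough illustration of how a prediction interval and a P-value of this kind can be produced, the sketch below simulates portfolio default rates under a one-factor Gaussian model with an asset correlation of 0.25. The one-factor form, the function names, and the homogeneous 2% portfolio are assumptions made for illustration; the methodology of Section 3.3 may differ in its details.

```python
import random
from statistics import NormalDist, mean, median

STD_NORMAL = NormalDist()

def simulate_default_rates(pds, rho=0.25, n_sims=2000, seed=0):
    """Simulate yearly portfolio default rates under a one-factor model.

    pds : list of one-year default probabilities (one per firm)
    rho : asset correlation (0.25 is the value quoted in the text)
    A positive aggregate shock r_m lowers every firm's conditional
    default probability, matching the sign convention of the text.
    """
    rng = random.Random(seed)
    thresholds = [STD_NORMAL.inv_cdf(p) for p in pds]
    root_rho, root_rest = rho ** 0.5, (1.0 - rho) ** 0.5
    rates = []
    for _ in range(n_sims):
        r_m = rng.gauss(0.0, 1.0)  # aggregate shock shared by all firms
        defaults = 0
        for t in thresholds:
            cond_pd = STD_NORMAL.cdf((t - root_rho * r_m) / root_rest)
            if rng.random() < cond_pd:
                defaults += 1
        rates.append(defaults / len(pds))
    return sorted(rates)

def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list (0 < q <= 100)."""
    rank = max(1, -(-q * len(sorted_values) // 100))  # ceiling division
    return sorted_values[rank - 1]

def p_value(sorted_rates, realized_rate):
    """Share of simulated default rates at or below the realized rate."""
    return sum(r <= realized_rate for r in sorted_rates) / len(sorted_rates)

# Hypothetical portfolio: 300 firms, each with a 2% one-year EDF.
rates = simulate_default_rates([0.02] * 300)
summary = {
    "mean": mean(rates),
    "median": median(rates),
    "p10": percentile(rates, 10),
    "p90": percentile(rates, 90),
    "p_value_at_1pct": p_value(rates, 0.01),
}
```

In this sketch the 10th and 90th percentiles play the role of the 80% prediction interval, and the simulated mean sits above the simulated median because correlated defaults produce occasional years with very high default rates.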
The predicted default rate tracks the realized default rate well, with exceptions during times of systematic shock. For example, in 2002 the markets crashed and the number of defaults was unexpectedly high relative to the model's prediction; we estimate the aggregate shock in that year at -0.29 standard deviations. Conversely, 2003 was an uncharacteristically good year for the economy, leading to substantially fewer defaults. The graph of the aggregate shocks shows a positive shock of 1.02 standard deviations in 2003, which produced the low default rate.
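One way to picture how an aggregate shock can be backed out from a realized default count is a grid-based Bayesian update. The sketch below is an assumption-laden illustration, not the paper's procedure: it uses a standard normal prior on the shock, a binomial likelihood, a single common default probability for all firms, and hypothetical inputs, so its numbers will not reproduce the tables in this section.

```python
from math import exp, lgamma, log
from statistics import NormalDist

STD_NORMAL = NormalDist()

def shock_posterior_quantiles(n_firms, n_defaults, pd, rho=0.25,
                              quantiles=(0.10, 0.50, 0.90)):
    """Grid approximation to the posterior of the aggregate shock r_m.

    Prior: r_m ~ N(0, 1). Likelihood: defaults ~ Binomial(n_firms, p(r_m)),
    with p(r_m) the one-factor conditional default probability. Treating
    every firm as having the same pd is a simplifying assumption.
    """
    threshold = STD_NORMAL.inv_cdf(pd)
    grid = [-5.0 + 0.01 * i for i in range(1001)]  # r_m from -5 to 5
    log_binom = (lgamma(n_firms + 1) - lgamma(n_defaults + 1)
                 - lgamma(n_firms - n_defaults + 1))
    weights = []
    for r_m in grid:
        p = STD_NORMAL.cdf((threshold - rho ** 0.5 * r_m) / (1 - rho) ** 0.5)
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard the logarithms
        log_like = (log_binom + n_defaults * log(p)
                    + (n_firms - n_defaults) * log(1 - p))
        weights.append(exp(log_like) * STD_NORMAL.pdf(r_m))
    total = sum(weights)
    results, running, targets = [], 0.0, list(quantiles)
    for r_m, w in zip(grid, weights):
        running += w / total  # cumulative posterior mass
        while targets and running >= targets[0]:
            results.append(r_m)
            targets.pop(0)
    return results

# Hypothetical year: 1,500 firms, 3% expected default rate, 10 defaults.
rm10, rm50, rm90 = shock_posterior_quantiles(1500, 10, 0.03)
```

With fewer defaults than expected, the posterior median `rm50` comes out positive, which is the qualitative pattern the text describes for a good year; more defaults than expected would push it negative.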
The results show that all realized default rates fall within the prediction interval. The P-values of the realized default rate
range from 17% to 65%, which is within the sampling variability that would be expected.
FIGURE 22 Comparison of median predicted default rate with the realized default rate, 1996-2006
The sample was restricted to European non-financial firms larger than 300 million dollars. We used an asset correlation of 0.25 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50, is the median of the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or lower than the actual default rate.
We summarize the numbers that underlie Figure 22 in Tables 13 and 14. Table 13 contains the number of firms, the number of defaults, the median and mean predicted default rate per year, and the 10th and 90th percentiles of the predicted default rate. We find that the mean predicted default rates are much larger than the median predicted default rates, indicating that the correlation effect skews the distribution of default rates to the right (a long tail of high default rates). If we ignored this effect and simply took the mean default rate of the sample, we would falsely conclude that the model overpredicts defaults. Table 14 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the P-value by year.
TABLE 13
Comparison of mean and median predicted number of defaults with the realized number of defaults between 1996 and 2006

Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   Percentile   Percentile
1996   0.87%            0.40%              0.44%          0.00%        2.00%        1596
1997   0.90%            0.50%              0.37%          0.00%        2.10%        1610
1998   0.67%            0.30%              0.38%          0.00%        1.40%        1588
1999   0.98%            0.50%              0.35%          0.00%        2.20%        1692
2000   0.92%            0.50%              0.38%          0.00%        2.10%        1580
2001   1.34%            0.80%              0.88%          0.10%        3.10%        1360    12
2002   1.92%            1.20%              1.70%          0.20%        4.40%        1409    24
2003   3.09%            2.20%              0.66%          0.50%        6.80%        1513    10
2004   1.65%            1.00%              0.77%          0.20%        3.80%        1563    12
2005   1.14%            0.70%              0.31%          0.10%        2.60%        1627
2006   0.47%            0.20%              0.06%          0.00%        0.80%        1607

The sample was restricted to European firms larger than 300 million dollars.
TABLE 14
Summary of aggregate shock and year-wise probability of realizing the actual number of defaults between 1996 and 2006

Year   10th Percentile   Median Aggregate Shock   90th Percentile   Probability of having actual defaults or even lower
1996   -0.30             0.04                     0.37              56.09%
1997   -0.16             0.21                     0.57              47.61%
1998   -0.44             -0.07                    0.28              60.46%
1999   -0.08             0.28                     0.62              46.15%
2000   -0.18             0.18                     0.53              48.31%
2001   -0.37             -0.08                    0.21              55.76%
2002   -0.52             -0.29                    -0.06             64.51%
2003   0.70              1.02                     1.32              17.02%
2004   -0.04             0.26                     0.54              42.95%
2005   0.15              0.54                     0.92              38.48%
2006   -0.05             0.57                     1.17              49.24%

The sample was restricted to European firms larger than 300 million dollars.
Results for Countries having at Least 100 Companies with Size Greater than $300 Million
We restricted the sample to the countries that have at least 100 companies with size greater than $300 million. These
countries tend to have larger equity markets. For these companies, the predicted default rate tracks the realized default
rate very well as shown in Figure 23. The relatively low default rate in year 2003 is indicative of a large positive shock.
The P-values of the realized default rate range from 20% to 69%, which is within the sampling variability that would be
expected.
FIGURE 23
The sample was restricted to European non-financial firms larger than 300 million dollars from the following countries: Switzerland,
Germany, Spain, France, Great Britain, Italy, Netherlands, and Sweden. We used an asset correlation of 0.25 to simulate defaults in
each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black line is the median
predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right
panel shows the aggregate shock distribution and P-values. The dark black line, rm50 is the median for the posterior distribution of
the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the blue line is the P-value of the actual default
rate, which is the probability of observing a default at or lower than the actual default rate.
4.2.6 Level Validation with CDS Data

The number of defaults observed for larger firms in Europe was lower than in North America, making the power of the test somewhat weaker than in North America. Therefore, we present another, indirect validation of EDF credit measures in Europe. This test analyzes the level bias in the European EDF credit measure relative to the U.S. EDF credit measure. The rationale for the test is the assumption that similar risks earn similar premia in the U.S. and Europe. So, if we group firms by EDF category, the same EDF category should have the same aggregate median spread in CDS markets across the two regions.
For example, if EDF levels in Europe substantially overstated the level of default risk in Europe relative to North
America, then if we were to compare a European firm to a North American firm with a comparable EDF, the European
firm on average would have a substantially lower CDS spread. Conversely, if there were no such systematic bias between
EDF credit measures in North America versus Europe, then the median spread should be approximately the same.
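The comparison described above amounts to grouping CDS spreads by EDF-implied rating category and region and then comparing quartiles. A minimal sketch follows, assuming a hypothetical record layout of (region, EDF-implied rating, spread in basis points) and made-up spreads; the function names are illustrative only.

```python
from collections import defaultdict
from statistics import quantiles

def spread_quartiles(records):
    """Return {(rating_bucket, region): (p25, median, p75)} of CDS spreads.

    records: iterable of (region, edf_implied_rating, cds_spread_bp).
    Each group needs at least two observations.
    """
    groups = defaultdict(list)
    for region, bucket, spread in records:
        groups[(bucket, region)].append(spread)
    out = {}
    for key, spreads in groups.items():
        q1, q2, q3 = quantiles(spreads, n=4, method="inclusive")
        out[key] = (q1, q2, q3)
    return out

def median_gap(summary, bucket, region_a="Europe", region_b="US"):
    """Median spread in region_a minus median spread in region_b."""
    return summary[(bucket, region_a)][1] - summary[(bucket, region_b)][1]

# Made-up spreads in basis points for one rating bucket (not study data).
records = (
    [("Europe", "A", s) for s in (30, 40, 50, 60, 70)]
    + [("US", "A", s) for s in (32, 42, 52, 62, 72)]
)
summary = spread_quartiles(records)
gap = median_gap(summary, "A")
```

A `gap` close to zero across rating buckets and over time would be consistent with no relative level bias; a persistently negative gap would suggest that European EDF levels overstate risk relative to the U.S.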
In Figure 24, we compare the median, 25th, and 75th percentile CDS spreads for the Aa-and-above and A EDF-implied rating categories. The median spreads, as well as the 25th and 75th percentiles, are comparable over time in the U.S. and Europe, indicating no relative bias in EDF levels in Europe versus the U.S. We also tried this for the Baa, Ba, B and
12
Caa EDF-implied rating categories and found comparable results, shown in Figures 25 and 26. There was some overlap in the underlying names in the two currencies; however, our findings are robust to using a completely non-overlapping sample. Sub-investment-grade names can be affected by liquidity risk, which can differ across regions, making the test less reliable for those categories.
FIGURE 24
Comparison of CDS spreads in the U.S. and Europe for Aa and above and A EDF-implied rating categories
Blue lines represent the 25th percentile, median, and 75th percentile of the CDS spread in Europe; red lines show the same data for the U.S.
12
The category Aaa is not shown because there were very few observations for CDS contracts in this category.
FIGURE 25
Comparison of CDS spreads in the U.S. and Europe for Baa and Ba EDF-implied rating categories
FIGURE 26
Comparison of CDS spreads in the U.S. and Europe for B and Caa EDF-implied rating categories
4.2.7 Conclusion
We showed that in Europe, EDF credit measures lead agency ratings in timely default prediction. EDF credit measures lead other alternative measures in their ability to discriminate good firms from bad firms over time and across various subsections of the data. Model-predicted default rates track realized default rates well, and CDS spreads are similar to those in the U.S. for the same EDF-implied rating categories.
4.3 Asia
In this section, we describe the results obtained in Asia.
4.3.1 Data
We start with all Asian firms that have publicly traded equity from 1996 to 2006. We restrict the sample to non-financial firms with more than $30 million in size (unless otherwise specified) to account for hidden or missing defaults. Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges, and insolvency proceedings.
Table 15 shows the number of companies by country in the Asian module of Credit Monitor and CreditEdge for two size categories: above $30 million and above $300 million.
We decided to exclude some countries from level validation:
China, because the definition of default under government intervention is not clear
Australia and New Zealand, because they belong to the Pacific region
Japan, because it has a different economic structure and a hidden default problem
Pakistan and Sri Lanka, because they have a small number of companies
The remaining countries have the most comprehensive default coverage. These countries are Hong Kong, India, Indonesia, Korea, Malaysia, the Philippines, Singapore, Thailand, and Taiwan. We ran power and level validation tests separately for Japan.
TABLE 15
Number of companies in Asian Module of Credit Monitor and CreditEdge by country and size

Country       Code   Size >= $30 Million   Size >= $300 Million
Australia     AUS    844                   258
China         CHN    1357                  385
Hong Kong     HKG    695                   231
Indonesia     IDN    200                   64
India         IND    567                   174
Japan         JPN    3955                  2274
Korea         KOR    899                   377
Sri Lanka     LKA    19
Malaysia      MYS    643                   135
New Zealand   NZL    101                   47
Pakistan      PAK    83                    24
Philippines   PHL    96                    25
Singapore     SGP    494                   129
Thailand      THA    360                   79
Taiwan        TWN    1196                  310
4.3.2 Timely Default Prediction

We assess the timeliness of default prediction following the methodology described in Section 3.1. We create a sample of defaulted firms, retaining monthly observations from 24 months before default until 10 months after default. We include only observations that have a non-missing history of EDF values and ratings 24 months prior to default. We then compute the median EDF credit measure and the median rating by months to default and overlay the two series.
Figure 27 demonstrates that in the event of default, EDF credit measures become elevated 10 months before ratings.
FIGURE 27 Median agency ratings and Moody's KMV EDF values for all rated defaulted firms in Asia from 24 months before default to 10 months after default between 1996 and 2006. EDF values are displayed on a log scale.
4.3.3 Default Predictive Power

The EDF credit measure has more discriminatory power than the Z-Score and the Merton default probability in Hong Kong, India, Indonesia, Korea, Malaysia, the Philippines, Singapore, Thailand, and Taiwan, as can be seen in Figure 28. The Accuracy Ratio for the EDF credit measure is 0.67. In contrast to the power tests performed in North America and Europe, the Z-Score outperforms the simple Merton model implied default probability in its ability to discriminate between bad and good firms, with Accuracy Ratios of 0.57 and 0.56, respectively.
FIGURE 28 Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit measures, Z-Scores, and Merton default probabilities for Asian non-financial companies between 10/2001 and 12/2006. The Accuracy Ratios for the EDF credit measure, Z-Score, and Merton default probability are 0.67, 0.57, and 0.56, respectively.
The EDF credit measure also has more discriminatory power than the Z-Score and the Merton default probability in Japan. Consistent with the results in the other nine countries, the Z-Score has a higher Accuracy Ratio than the Merton default probability. CAP curves are presented in Figure 29. The Accuracy Ratios of the EDF credit measure, Z-Score, and Merton default probability are 0.89, 0.79, and 0.77, respectively.
FIGURE 29 CAP curves comparing Moody's KMV EDF credit measures, Z-Scores, and Merton default probabilities for Japanese non-financial companies between 10/2001 and 12/2006. The Accuracy Ratios for the EDF credit measure, Z-Score, and Merton default probability are 0.89, 0.79, and 0.77, respectively.
4.3.4 Level Validation

Figure 30 shows the level validation results for the sample of Asian firms (Hong Kong, India, Indonesia, Korea, Malaysia, the Philippines, Singapore, Thailand, and Taiwan) with size greater than $300 million. The left panel of Figure 30 displays the mean predicted, median predicted, and actual default rates, along with the 80% prediction interval for the predicted default rate. We used an asset correlation of 0.25 to simulate defaults in each year. The right panel of Figure 30 presents the posterior distribution of the aggregate shock given the actual default rate, together with the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
Collection of default data in Asia is more difficult than in the U.S. and Europe because of language barriers, poor reporting of default events, and government intervention to prevent company collapse, which often goes unreported. We could expect the underprediction of defaults in 1996, 1997, and 1998 because of the severe Asian financial crisis. The overprediction of defaults in 2001 and 2002 may reflect market uncertainties regarding the Asian recovery while Europe and North America were in recession. The P-values of the realized default rate range from 11% to 87%, which is within the sampling variability that would be expected.
FIGURE 30
The sample was restricted to Asian non-financial firms larger than 300 million dollars from the following countries: Hong Kong, India, Indonesia, Korea, Malaysia, the Philippines, Singapore, Thailand, and Taiwan. We used an asset correlation of 0.25 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the predicted default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right panel shows the posterior distribution of the aggregate shock and P-values. The dark black line is the median aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles of the aggregate shock; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or lower than the actual default rate.
Accuracy of Levels in Japan

Figure 31 presents the level validation results for the sample of Japanese firms with size greater than $300 million. As above, the left panel of Figure 31 displays the mean predicted, median predicted, and actual default rates, along with the 80% prediction interval for predicted defaults. We used an asset correlation of 0.25 to simulate defaults in each year. The right panel of Figure 31 presents the posterior distribution of the aggregate shock given the actual default rate, together with the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
As expected, the EDF credit measure is higher than the observed default rate in Japan, due to the practice of banks and parent companies extending credit to companies that would otherwise default.
FIGURE 31
Comparison of median predicted default rate with the realized default rate, 1996-2006
The sample was restricted to Japanese non-financial firms larger than 300 million dollars. We used an asset correlation of 0.25 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50, is the median of the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or lower than the actual default rate.
4.3.5 Conclusion
We showed that in Asia, EDF credit measures lead agency ratings in timely default prediction. For countries where we have the best default coverage, EDF credit measures lead other alternative measures in their ability to discriminate good firms from bad firms over time and across various subsections of the data. The realized default rate for countries with better default coverage lies within the prediction interval. In Japan, the EDF model discriminates distressed firms from healthy firms very well.
4.4 Median EDF by Rating Category across Regions
The four panels in Figure 32 display the median EDF credit measure by rating category across different regions. Levels of the EDF credit measure for North American non-financial companies, Asia-Pacific non-financial companies, and global financial companies are comparable for all rating categories. Levels in Europe are slightly lower for better-quality firms in the A and Baa rating categories.
FIGURE 32
Comparison of median EDF across different regions by Moody's rating categories
Panels: North American non-financial companies; European non-financial companies; Asian-Pacific non-financial companies; Global financial companies. Each panel plots the median EDF (axis label EDF8, log scale from 0.01% to 100%) by rating category (legend entries Ba and Baa are legible) against time (axis ticks from 01M90 to 01M10).
CONCLUSION
In this document, we tested the performance of the Moody's KMV EDF credit measure in its timeliness of default prediction, ability to discriminate good firms from bad firms, and accuracy of levels. Whenever possible, we compared its performance to other popular alternatives available to the market.
We find that the EDF credit measure performed well on all counts over time and across various subsections of the data. We also showed that the Moody's KMV model works well not only in North America, but also in Europe and Asia. In Europe, our findings are especially significant because our European sample consisted of various subregions with substantially different debt-holding structures and bankruptcy mechanisms, which could have adversely affected the results if the model were not universal in concept.
While research at Moody's KMV continues to refine this measure and to take into account the nuances of the data as the markets evolve and become more complex, we feel that, as of now, this measure sets a standard in the industry for a transparent and predictive absolute measure of the probability of default.
APPENDIX A: CAP VS. ROC

The most popular validation techniques available today are Cumulative Accuracy Profile (CAP) and Receiver Operating
Characteristic (ROC). CAP has its summary statistic known as the Accuracy Ratio, while ROC has its summary statistic
as the area under the ROC curve.
19
As a specific case, consider a sample that has N defaulters and M non-defaulters. The ith firm in the sample is assigned a default probability $p_i$. Without loss of generality, assume the ordering $p_1 \ge p_2 \ge p_3 \ge \dots \ge p_{M+N}$. For each $p_i$, one can assign a pair $(m(p_i), n(p_i))$, where $m(p_i)$ is the number of non-defaulters that have a probability of default greater than or equal to $p_i$, and $n(p_i)$ is the number of defaulters that have a probability of default greater than or equal to $p_i$. Obviously $m(p_{M+N}) = M$ and $n(p_{M+N}) = N$. One can translate these numbers into fractions of non-defaulters and defaulters, $f_m(p_i)$ and $f_n(p_i)$, where

$$f_m(p_i) = \frac{m(p_i)}{M}$$

is the fraction of non-defaulters that have a default probability greater than or equal to $p_i$, and

$$f_n(p_i) = \frac{n(p_i)}{N}$$

is the fraction of defaulters that have a default probability greater than or equal to $p_i$. Similarly, one can create an overall fraction

$$f(p_i) = \frac{m(p_i) + n(p_i)}{M+N}$$

that represents the fraction of firms in the sample that have a default probability greater than or equal to $p_i$.
CAP is now defined as the graph of $f_n(p_i)$ against $f(p_i)$ for all values of $p_i$. ROC is the graph of $f_n(p_i)$ against $f_m(p_i)$ for all values of $p_i$. Alternatively, ROC can be interpreted as the curve that plots the hit rate against the false alarm rate for any cut-off C, across all values of C.
Receiver Operating Characteristic (ROC) analysis is a popular approach borrowed from medical science. ROC curves, also known as power curves, are a well-known way of establishing the ability of a model to distinguish signal from noise or, in our case, defaulters from non-defaulters. The basic approach takes a sample of M+N firms, of which M are good firms (non-defaulters) and N are defaulted firms. If we rank the firms by their likelihood of defaulting, from highest to lowest, and exclude the riskiest z% of firms from the sample, then we exclude some actual defaults and some non-defaults. In total, we exclude z/100*(M+N) firms.
A perfect model would exclude only defaults as long as z/100 is less than N/(M+N). A random model with no information would exclude zM/100 non-defaults (i.e., z% of non-defaults) and zN/100 defaults (i.e., z% of defaults). Let us assume that we exclude x(z)% of non-defaults and y(z)% of defaults by excluding the riskiest z% of the sample. By varying z from 0 to 100, we obtain various pairs (x, y). Plotting y against x on an X-Y plot gives a graph that indicates the power of the model. For example, a model with no predictive power has an (x, y) plot along the 45-degree line, while a perfect model plots as a flat horizontal line at 100%. Note that both x and y vary from 0 to 100%. Also, this test needs only an ordinal ranking and can therefore be used to compare all the various approaches to credit risk on the same plane.
The area of the CAP curve above the 45-degree line, as a fraction of the area of the perfect model's CAP curve above the 45-degree line, is called the Accuracy Ratio (AR). The area under the ROC curve is called the AUC (Area Under Curve). This is illustrated in Figures 33 and 34. Both figures are based on the same population of non-defaulters and defaulters. Figure 33 shows the Cumulative Accuracy Profile curve, which is the curve bounding area A. A perfect model would have the Cumulative Accuracy Profile curve represented by the straight line bounding area B. The Accuracy Ratio is A/(A+B). Figure 34 shows the ROC curve bounding the shaded area. The shaded area represents the AUC. It can be shown that AR = 2AUC - 1. For more details and proof of this relationship, refer to Engelmann, Hayden, and Tasche (2003).
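The definitions above can be checked numerically. The following sketch builds CAP and ROC points from risk scores, integrates them with the trapezoidal rule, and compares the Accuracy Ratio with 2AUC - 1. The function names and the eight-firm sample are hypothetical, and tied scores are handled by adding them as a block, which is one of several possible conventions.

```python
def cap_and_roc(scores, defaulted):
    """Build CAP and ROC points from risk scores (higher = riskier).

    Returns (cap_points, roc_points); each is a list of (x, y) pairs that
    starts at (0, 0) and ends at (1, 1). Tied scores are added as a block.
    """
    n_def = sum(defaulted)
    n_good = len(defaulted) - n_def
    ranked = sorted(zip(scores, defaulted), key=lambda pair: -pair[0])
    cap, roc = [(0.0, 0.0)], [(0.0, 0.0)]
    seen_def = seen_good = 0
    for i, (score, is_default) in enumerate(ranked):
        seen_def += is_default
        seen_good += 1 - is_default
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie:
            cap.append(((seen_def + seen_good) / len(ranked), seen_def / n_def))
            roc.append((seen_good / n_good, seen_def / n_def))
    return cap, roc

def area(points):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def accuracy_ratio(scores, defaulted):
    """Area between CAP and diagonal, relative to the perfect model."""
    cap, _ = cap_and_roc(scores, defaulted)
    default_share = sum(defaulted) / len(defaulted)
    perfect_area = 1.0 - default_share / 2.0  # area under the perfect CAP
    return (area(cap) - 0.5) / (perfect_area - 0.5)

def auc(scores, defaulted):
    """Area under the ROC curve."""
    _, roc = cap_and_roc(scores, defaulted)
    return area(roc)

# Hypothetical sample: 3 defaulters (1) and 5 non-defaulters (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
defaulted = [1, 1, 0, 1, 0, 0, 0, 0]
ar_value = accuracy_ratio(scores, defaulted)
auc_value = auc(scores, defaulted)
```

For this sample, 14 of the 15 defaulter/non-defaulter pairs are ranked correctly, so the AUC is 14/15 and the Accuracy Ratio is 13/15, in line with AR = 2AUC - 1.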
19
Defaulters are usually counted over a certain horizon. Therefore, these tests are horizon-specific. In this document, all tests are for a 1-year horizon.
FIGURE 33
FIGURE 34
APPENDIX B: SUMMARY OF ACCURACY RATIOS FOR EDF CREDIT MEASURES AND AGENCY RATINGS BY YEAR

TABLE 16 Accuracy Ratios for EDF Credit Measures and agency ratings for U.S. non-financial companies by year at 1-year horizon

Date                                        EDF Credit Measure   Ratings
1996 (EDF: 12/95; Defaults: 1/96-12/96)     0.72                 0.80
1997 (EDF: 12/96; Defaults: 1/97-12/97)     0.91                 0.80
1998 (EDF: 12/97; Defaults: 1/98-12/98)     0.90                 0.76
1999 (EDF: 12/98; Defaults: 1/99-12/99)     0.85                 0.75
2000 (EDF: 12/99; Defaults: 1/00-12/00)     0.76                 0.67
2001 (EDF: 12/00; Defaults: 1/01-12/01)     0.74                 0.65
2002 (EDF: 12/01; Defaults: 1/02-12/02)     0.79                 0.58
2003 (EDF: 12/02; Defaults: 1/03-12/03)     0.86                 0.77
2004 (EDF: 12/03; Defaults: 1/04-12/04)     0.85                 0.84
2005 (EDF: 12/04; Defaults: 1/05-12/05)     0.89                 0.75
2006 (EDF: 12/05; Defaults: 1/06-12/06)     0.96                 0.82
TABLE 17 Accuracy Ratios for EDF Credit Measures and agency ratings for U.S. non-financial companies by year at 5-year horizon

Date                                       EDF Credit Measure   Ratings
1991 (EDF: 12/90; Defaults: 1/91-1/96)     0.77                 0.73
1992 (EDF: 12/91; Defaults: 1/92-1/97)     0.84                 0.81
1993 (EDF: 12/92; Defaults: 1/93-1/98)     0.79                 0.76
1994 (EDF: 12/93; Defaults: 1/94-1/99)     0.67                 0.66
1995 (EDF: 12/94; Defaults: 1/95-1/00)     0.68                 0.58
1996 (EDF: 12/95; Defaults: 1/96-1/01)     0.68                 0.62
1997 (EDF: 12/96; Defaults: 1/97-1/02)     0.66                 0.62
1998 (EDF: 12/97; Defaults: 1/98-1/03)     0.67                 0.64
1999 (EDF: 12/98; Defaults: 1/99-1/04)     0.65                 0.65
2000 (EDF: 12/99; Defaults: 1/00-1/05)     0.61                 0.64
2001 (EDF: 12/00; Defaults: 1/01-1/06)     0.65                 0.61
2002 (EDF: 12/01; Defaults: 1/02-1/07)     0.71                 0.60

We used EDF data up until 12/2002 and default data up until 12/2006.
REFERENCES
1. Irina Korablev, 2005, Power and Level Validation of the EDF Credit Measure in the European Market.
2. Jeff Bohn, Navneet Arora, & Irina Korablev, 2005, Power and Level Validation of the Moody's KMV EDF Credit Measure in the U.S. Market.
3. Agrawal, Deepak, Navneet Arora, and Jeffrey Bohn, 2004, Parsimony in Practice: An EDF-based Model of Credit Spreads, Moody's KMV White Paper.
4. Arora, Navneet, Jeffrey Bohn, and Fanlin Zhu, 2005, Reduced vs. Structural Models of Credit Risk: A Case Study of Three Models, Moody's KMV Technical Document.
5. Crosbie, Peter, and Jeffrey Bohn, 2003, Modeling Default Risk, Moody's KMV Technical Document.
6. Das, Ashish, Amnon Levy, Anil Gurnaney, Jeffrey Bohn, Peter Crosbie and Stephen Kealhofer, 2004, Modeling Portfolio Risk, Moody's KMV Technical Document.
7. Douglas W. Dwyer & Shisheng Qu, 2007, EDF 8.0 Model Enhancements.
8. Douglas W. Dwyer, 2007, The Distribution of Defaults and Bayesian Model Validation, Journal of Risk Model Validation, Volume 1, no. 1.
9. Engelmann, Berndt, Evelyn Hayden, and Dirk Tasche, 2003, Testing Rating Accuracy, Risk, January 2003.
10. Eom, Young Ho, Jean Helwege, and Jing Zhi Huang, 2003, Structural Models of Corporate Bond Pricing: An Empirical Analysis, Review of Financial Studies.
11. Hull, John, 1999, Options, Futures and Other Derivatives, Prentice Hall Publications, Fourth Edition.
12. Johnson, Norman, Samuel Kotz, and Adrienne Kemp, 1993, Univariate Discrete Distributions, 2nd Ed., NY: Wiley.
13. Kurbat, Matt, and Irina Korablev, 2002, Methodology for Testing the Level of the EDF Credit Measure, Moody's KMV White Paper.
14. Lyden, Scott, and David Saraniti, 2000, An Empirical Analysis of Classical Theory of Corporate Security Valuation, Research Paper, Barclays Global Investors.
4.1.3 Default Predictive Power U.S.

We provide the analysis by three different size categories:

Size is greater than $30 million

Size is between $30 and $300 million

Size is greater than $300 million

1-Year EDF Credit Measure

1993 1994 1995

1998 1999 2000