
SEPTEMBER 10, 2007

POWER AND LEVEL VALIDATION OF MOODY'S KMV EDF CREDIT MEASURES IN
NORTH AMERICA, EUROPE, AND ASIA

MODELING METHODOLOGY

AUTHORS
Irina Korablev
Douglas Dwyer

ABSTRACT
In this paper, we validate the performance of Moody's KMV EDF credit measures in terms of their
timeliness of default prediction, ability to discriminate good firms from bad firms, and
accuracy of levels in three regions: North America, Europe, and Asia. We focus on the period
1996-2006 for most of our tests. Wherever possible, we compare the performance to that of
other popular alternatives, such as agency ratings, Moody's KMV RiskCalc EDF credit
measures, Altman's Z-Scores, and a simpler version of the Merton model. We find that EDF
credit measures perform consistently well across different time horizons and across different
subsamples based on firm size and credit quality. Our tests indicate that EDF credit measures
provide a very useful measure of credit risk that can be applied throughout the world.

Copyright 2007, Moody's KMV Company. All rights reserved. Credit Monitor, CreditEdge, CreditEdge Plus,
CreditMark, DealAnalyzer, EDFCalc, Private Firm Model, Portfolio Preprocessor, GCorr, the Moody's KMV logo,
Moody's KMV Financial Analyst, Moody's KMV LossCalc, Moody's KMV Portfolio Manager, Moody's KMV Risk
Advisor, Moody's KMV RiskCalc, RiskAnalyst, Expected Default Frequency, and EDF are trademarks owned by MIS
Quality Management Corp. and used under license by Moody's KMV Company.

Published by:
Moody's KMV Company
To Learn More
Please contact your Moody's KMV client representative, visit us online at www.moodyskmv.com, contact
Moody's KMV via e-mail at info@mkmv.com, or call us at:
NORTH AND SOUTH AMERICA, NEW ZEALAND AND AUSTRALIA, CALL:
1 866 321 MKMV (6568) or 415 874 6000
EUROPE, THE MIDDLE EAST, AFRICA AND INDIA, CALL:
44 20 7280 8300
FROM ASIA CALL:
813 3218 1160

TABLE OF CONTENTS
1 INTRODUCTION
2 CREDIT RISK ASSESSMENT APPROACHES
2.1 Moody's KMV EDF Credit Measures
2.2 Agency Ratings
2.3 Moody's KMV RiskCalc EDF Credit Measures
2.4 Merton's Structural Model
2.5 Altman's Z-Score
3 EMPIRICAL METHODOLOGY
3.1 Timely Default Prediction
3.2 Default Predictive Power
3.3 Level Validation with Default Data
3.3.1 Interpreting the Analytical Outputs for Level Validation
3.4 Level Validation with CDS Data
3.5 Median EDF by Rating Category across Regions
4 EMPIRICAL RESULTS
4.1 North America
4.1.1 Data
4.1.2 Timely Default Prediction - U.S.
4.1.3 Default Predictive Power - U.S.
4.1.4 Accuracy of Levels - U.S.
4.1.5 Timely Default Prediction - Outside the U.S.
4.1.6 Default Predictive Power - Outside the U.S.
4.1.7 Accuracy of Levels - Outside the U.S.
4.1.8 Conclusion
4.2 Europe
4.2.1 Diversity in Bankruptcy Mechanisms and Creditor Protection
4.2.2 Data
4.2.3 Timely Default Prediction
4.2.4 Default Predictive Power
4.2.5 Level Validation with Default Data
4.2.6 Level Validation with CDS Data
4.2.7 Conclusion
4.3 Asia
4.3.1 Data
4.3.2 Timely Default Prediction
4.3.3 Default Predictive Power
4.3.4 Level Validation
4.3.5 Conclusion
4.4 Median EDF by Rating Category across Regions
5 CONCLUSION
APPENDIX B: SUMMARY OF ACCURACY RATIOS FOR EDF CREDIT MEASURES AND AGENCY RATINGS BY YEAR

1 INTRODUCTION
The new Basel Capital Accord states: "The methodology for assigning credit assessments must be rigorous, systematic,
and subject to some form of validation based on historical experience." There are two important components to this
validation process: the ability to predict defaults and the accuracy of the default predictive measure.
The first criterion implies that a credit measure should be dynamic enough to be a meaningful and timely signal of
deteriorating credit quality or an impending credit event. In this regard, the Basel Accord states: "Assessments must be
subject to ongoing review and responsive to changes in financial condition. Before being recognized by supervisors, an
assessment methodology for each market segment, including rigorous back-testing, must have been established for at
least one year." This also means that the credit assessment technology should have the ability to distinguish between
defaulters and non-defaulters. It should not allow defaulters to enter the sample while trying to create a sample of
good-quality firms (Type I Error). Conversely, it should not exclude good-quality firms from the sample while trying to
exclude potential defaulters (Type II Error).
The second criterion is focused on the accuracy of the credit assessment measure, so that it can be useful to banks and
other financial institutions in their efforts toward risk measurement, valuation, and capital allocation. The Basel Accord
states: "Banks must have a robust system in place to validate the accuracy and consistency of rating systems, processes,
and the estimation of PDs" (Probabilities of Default).
The objective of this document is to compare the performance, based on the above validation criteria, of EDF credit
measures with some of the other popular credit assessment approaches. The popular approaches that we consider are the
following:

Agency ratings

RiskCalc U.S. v3.1 private firm model

A Simple Merton structural model

Altman's Z-Score

In this paper we present our test results for three regions: North America, Europe, and Asia. The rest of the paper is
organized as follows: Section 2 briefly discusses the credit assessment approaches that we consider in our paper. Section 3
highlights the empirical methodology we follow to compare the approaches. Section 4 presents the results of our tests by
region¹ and interprets the economic meaningfulness of these results. Section 5 concludes the paper.

2 CREDIT RISK ASSESSMENT APPROACHES
The credit risk assessment approaches considered in this paper are:

Moody's KMV EDF credit measures

Agency ratings

Moody's KMV RiskCalc private firm model

Merton's structural model

Altman's Z-Scores²

In the following sections, we briefly discuss each of the approaches.

¹ Section 4.1 presents the results for North America, Section 4.2 presents the results for Europe, and Section 4.3 presents
the results for Asia.
² For reasons explained in the next two sections, not all the approaches can be subjected to tests on all the criteria. We try
to include as many of these approaches as possible in our test of each criterion.

2.1 Moody's KMV EDF Credit Measures

The structural view on credit risk was first made commercially viable with the introduction of the Vasicek-Kealhofer
(VK) model. This model offers a rich framework that treats equity as a perpetual down-and-out option on the
underlying assets of the firm. This framework incorporates five different classes of liabilities: short-term liabilities,
long-term liabilities, convertible debt, preferred shares, and common shares. To overcome the regular problems encountered
by structural models due to the assumption of normality, the VK model uses an empirical mapping based on actual
default data to get the default probabilities, known as EDF credit measures and offered by Moody's KMV.³ Volatility is
estimated through a Bayesian approach that combines a comparables analysis with an iterative approach.
EDF credit measures are the outputs of the Moody's KMV Credit Monitor and CreditEdge applications. An EDF credit
measure is a quantitative measure of credit quality. More specifically, an EDF credit measure is an estimate of the
physical probability of default for a given firm. For an overview of the EDF credit measure, see Crosbie and Bohn
(2003).
In 2007, Moody's KMV released EDF 8.0, which refines the mapping of the Distance-to-Default to the EDF credit
measure using a much larger default database observed over a longer time period. Details of the new model enhancement
can be found in Dwyer and Qu (2007).
The EDF estimates are now bounded between 0.01% (for an EDF value of 0.01) and 35% (for an EDF value of 35).
Moody's KMV offers a term structure of EDF credit measures for 1 to 10 years and an extrapolation scheme to get
shorter-term EDF credit measures. The risk-free rate used in the calculation of EDF credit measures is now updated
monthly.

2.2 Agency Ratings

Moody's Investors Service, Standard & Poor's Corporation, and other well-known rating agencies around the world
have been assigning credit ratings to major borrowers for decades. These ratings are ordinal measures of credit quality
(i.e., they help rank firms by their quality of credit). They have established international credibility because of the
long history of rating agencies and the extensive testing of their relative performance.

2.3 Moody's KMV RiskCalc EDF Credit Measures

Moody's KMV RiskCalc is designed to calculate EDF credit measures for private companies. Private companies are
typically smaller than public companies and are not required to file financial statements with the SEC.
The RiskCalc model incorporates aspects of both the structural, market-based approach, in the form of industry-level
distance-to-default measures, and the localized financial statement-based approach. While it incorporates equity market
information at the aggregate level, RiskCalc does not take advantage of the equity information of the specific company.
We used the RiskCalc v3.1 U.S. model to obtain RiskCalc EDF credit measures for the set of publicly traded companies.
Comparing public firm EDF credit measures to RiskCalc EDF credit measures computed on public firms represents an
out-of-universe test of RiskCalc.

2.4 Merton's Structural Model

The Merton model of risky debt is the original structural model of credit risk, and perhaps the most significant
contribution to the area of quantitative credit risk research. This model assumes that equity is a call option on the value
of assets of the firm. From this insight, the value of debt can be derived based on the observed equity value. The default
event is modeled as the firm's asset value falling below a threshold level (i.e., default barrier). Given the default barrier
and the asset value parameters, the probability of default can be estimated for various horizons. A detailed description of
this model can be found in most standard finance textbooks.⁴

³ See Eom, Helwege, and Huang (2003) for details of the discussion.
⁴ See, for example, Hull (1999).

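For our specific tests, the model has been implemented as:

\[ \text{Default Point}_{i,\text{Merton}} = \text{Short Term Liabilities} + 0.5 \times \text{Long Term Liabilities} \]

The default probability for a firm i for a time horizon t is computed as:

\[ PD_i = \Phi\left( -\,\frac{\ln\left( \dfrac{AVL_i}{\text{Default Point}_{i,\text{Merton}}} \right) + \left( \mu_i - 0.5\,\sigma_i^2 \right) t}{\sigma_i \sqrt{t}} \right) \tag{1} \]

\[ \sigma_i = \sigma_i^{equity}\, \frac{EVL_i}{AVL_i} \left( \frac{\partial EVL_i}{\partial AVL_i} \right)^{-1} \tag{2} \]

\[ EVL_i = AVL_i\, \Phi(d_1) - \text{Default Point}_{i,\text{Merton}}\; e^{-rt}\, \Phi(d_2) \tag{3} \]

\[ d_1 = \frac{\ln\left( \dfrac{AVL_i}{\text{Default Point}_{i,\text{Merton}}} \right) + \left( r + 0.5\,\sigma_i^2 \right) t}{\sigma_i \sqrt{t}}, \qquad d_2 = d_1 - \sigma_i \sqrt{t} \]

\(\sigma_i\), \(\sigma_i^{equity}\), \(AVL_i\), and \(EVL_i\) are the asset volatility, equity volatility, asset value, and
equity value of firm i, respectively. \(\Phi(x)\) is the cumulative normal distribution function. \(\mu_i\) is the drift rate
for the asset returns of firm i, while r is the riskless rate of return. \(\sigma_i^{equity}\) is computed as the standard
deviation of three years of weekly equity returns for company i. Asset value \(AVL_i\) is computed by solving equations (2)
and (3) simultaneously.⁵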
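The simultaneous solution of equations (2) and (3) can be illustrated with a minimal sketch. It assumes a simple fixed-point scheme (bisection on the asset value for a given asset volatility, followed by an update of the asset volatility) and uses the fact that, under equation (3), the derivative of equity value with respect to asset value equals Φ(d1). The function names, the starting guess, and the input numbers are hypothetical and chosen only for illustration; this is not necessarily how the tests in this paper were implemented.

```python
from math import exp, log, sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def default_point(short_term_liabilities, long_term_liabilities):
    """Default point as defined in the text: STL + 0.5 * LTL."""
    return short_term_liabilities + 0.5 * long_term_liabilities


def d1_d2(avl, dp, r, sigma, t):
    """d1 and d2 from the definitions following equation (3)."""
    d1 = (log(avl / dp) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return d1, d1 - sigma * sqrt(t)


def equity_value(avl, dp, r, sigma, t):
    """Equation (3): equity as a call option on the firm's assets."""
    d1, d2 = d1_d2(avl, dp, r, sigma, t)
    return avl * PHI(d1) - dp * exp(-r * t) * PHI(d2)


def solve_assets(evl, sigma_equity, dp, r, t=1.0, tol=1e-12, max_iter=500):
    """Solve equations (2) and (3) for asset value and asset volatility.

    Assumes r >= 0, so the asset value lies between evl and evl + dp.
    """
    sigma = sigma_equity * evl / (evl + dp)  # assumed starting guess
    avl = evl + dp
    for _ in range(max_iter):
        # Bisection on equation (3): equity value is increasing in asset value.
        lo, hi = evl, evl + dp
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if equity_value(mid, dp, r, sigma, t) < evl:
                lo = mid
            else:
                hi = mid
        avl = 0.5 * (lo + hi)
        # Equation (2), using dEVL/dAVL = PHI(d1).
        d1, _ = d1_d2(avl, dp, r, sigma, t)
        new_sigma = sigma_equity * evl / (avl * PHI(d1))
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return avl, sigma


def merton_pd(avl, dp, mu, sigma, t=1.0):
    """Equation (1): probability that assets fall below the default point."""
    dd = (log(avl / dp) + (mu - 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return PHI(-dd)


# Hypothetical firm: equity value 60, equity volatility 40%, STL 30, LTL 40.
dp = default_point(30.0, 40.0)
avl, sigma = solve_assets(evl=60.0, sigma_equity=0.40, dp=dp, r=0.05)
pd_one_year = merton_pd(avl, dp, mu=0.08, sigma=sigma)
```

The drift rate passed to `merton_pd` is an input the user must supply; the text does not state how it was estimated.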

⁵ In contrast to the two-equations-and-two-unknowns approach, we use an iterative approach to solve for empirical volatility,
which is combined with modeled volatility in a Bayesian fashion.

2.5 Altman's Z-Score

Altman's Z-Score came as a response to the need for identifying the financial health of any business based on observable
accounting and market ratios. The original measure was developed in 1968 by Edward Altman, and the Z-Score is now
available in various forms. We chose the public firm form, which includes market capitalization in the leverage ratio, and
calculated Z-Scores as follows:

\[ Z = X_1 + X_2 + X_3 + X_4 + X_5 \tag{4} \]

where

\[ X_1 = 1.2 \times \frac{\text{Current Liabilities}}{\text{Book Asset Value}} \]
is the ratio of Current Liabilities to Total Assets;

\[ X_2 = 1.4 \times \frac{\text{Retained Earnings}}{\text{Book Asset Value}} \]
is the Profitability Ratio;

\[ X_3 = 3.3 \times \frac{\text{Operating Income before Depreciation}}{\text{Book Asset Value}} \]
is the ratio of EBITDA to Total Assets;

\[ X_4 = 0.6 \times \frac{\text{Market Capitalization}}{\text{Book Value of Liabilities}} \]
is the ratio of Market Value of Equity to Book Value of Liabilities; and

\[ X_5 = \frac{\text{Sales}}{\text{Book Asset Value}} \]
is the ratio of Sales to Total Assets.

The calculation typically produces a Z-Score between -5 and 10, with a higher Z-Score implying better credit quality
and a lower chance of bankruptcy. Z-Scores are not interpreted directly as default probabilities and therefore work only as
ordinal measures of financial health. Consequently, they cannot be used directly for valuation, quantitative risk
assessment, or capital allocation purposes.
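As a worked illustration of equation (4), the following minimal sketch computes a Z-Score from the five weighted ratios exactly as they are defined in this section. The function name, argument names, and balance sheet figures are hypothetical.

```python
def z_score(current_liabilities, retained_earnings, operating_income_before_depreciation,
            market_capitalization, book_liabilities, sales, book_assets):
    """Z-Score per equation (4), with each X term defined as in the text above."""
    x1 = 1.2 * current_liabilities / book_assets
    x2 = 1.4 * retained_earnings / book_assets
    x3 = 3.3 * operating_income_before_depreciation / book_assets
    x4 = 0.6 * market_capitalization / book_liabilities
    x5 = sales / book_assets
    return x1 + x2 + x3 + x4 + x5


# Hypothetical firm (all figures in the same currency unit).
example_z = z_score(
    current_liabilities=20.0,
    retained_earnings=30.0,
    operating_income_before_depreciation=15.0,
    market_capitalization=120.0,
    book_liabilities=60.0,
    sales=100.0,
    book_assets=100.0,
)
# Terms: 0.24 + 0.42 + 0.495 + 1.2 + 1.0
```

Because the score is ordinal, only the ranking of firms by `z_score` matters for the power tests, not the level.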

3 EMPIRICAL METHODOLOGY
In this section, we describe the methodology we chose for tests of each criterion.

3.1 Timely Default Prediction

Timeliness measures how many months before an impending credit event EDF credit measures signal deteriorating
credit quality. To test timeliness, we create a sample of defaulted firms, retaining monthly observations from 24 months
prior to default up to 12 months after default. We compute the median EDF credit measure and the median Moody's
rating by months to default, then overlay and compare the two. For testing timeliness against ratings, we use the
Moody's rating. To check that the measure is robust across time, rating grades, and size, we also provide the analysis,
wherever possible, for subsets of the data based on time period:

1996-2000

2001 and beyond
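The median-by-months-to-default calculation described above can be sketched as follows. This is a minimal illustration, assuming each observation has already been reduced to a pair of (months relative to default, value); the function name and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_by_months_to_default(observations, start=-24, end=12):
    """Median value for each month relative to default.

    observations: iterable of (months_to_default, value) pairs, where negative
    months are before default and positive months are after default.
    Observations outside the [start, end] window are dropped.
    """
    buckets = defaultdict(list)
    for months_to_default, value in observations:
        if start <= months_to_default <= end:
            buckets[months_to_default].append(value)
    return {month: median(values) for month, values in sorted(buckets.items())}


# Hypothetical EDF values (in percent) for a few defaulted firms.
sample = [(-24, 1.0), (-24, 3.0), (-1, 10.0), (-1, 20.0), (-1, 30.0), (13, 99.0), (-30, 0.5)]
median_edf = median_by_months_to_default(sample)
```

The same function can be applied to ratings after mapping each rating grade to a numeric rank, which allows the two median series to be overlaid on a common time axis.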

3.2 Default Predictive Power

While a default predictive measure can be timely in warning of impending defaults, it may not be as effective in
distinguishing a good firm from a bad firm. The calibration of the model may be on the conservative side, inflating the
default probability of all suspect names, some of which might not be genuinely distressed. In this case, even
though one could claim that the model performed well in predicting impending defaults, it would be fairly mediocre in
its ability to distinguish good firms from bad firms. One of the essential features of a good model is that it should be
sophisticated enough to differentiate bad (genuinely distressed) firms from good firms (false alarms). There are two
well-known approaches to testing a model for its power:

Cumulative Accuracy Profile (CAP) with its output known as Accuracy Ratio (AR).

Receiver Operating Characteristic (ROC) with its output known as Area Under Curve (AUC).

Typically, the larger the Accuracy Ratio or Area Under Curve, the better the model. In extreme cases, a totally random
model that bears no information on impending defaults has AR = 0 and AUC = 0.5. For a perfect model,
AR = AUC = 1. The two approaches are equivalent, with AR = 2 × AUC - 1. A more detailed discussion can be found in
Appendix A.
In this article, we use the Cumulative Accuracy Profile approach, and provide AR as our output. We compared EDF
credit measures to:

Ratings

RiskCalc EDF credit measure

Simple Merton model

Altman's equity-based Z-Score.
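The relationship AR = 2 × AUC - 1 can be checked numerically. The sketch below is a minimal illustration that computes the Accuracy Ratio from the area under the CAP curve and the AUC from pairwise comparisons of defaulters and non-defaulters. It assumes that a higher score means higher risk and that there are no tied scores; the function names and the toy data are hypothetical.

```python
def accuracy_ratio(scores, defaulted):
    """Accuracy Ratio from the CAP curve (higher score = riskier)."""
    ranked = sorted(zip(scores, defaulted), key=lambda pair: -pair[0])
    n = len(ranked)
    n_defaults = sum(1 for _, flag in ranked if flag)
    area_cap = 0.0
    previous_fraction = 0.0
    captured = 0
    for _, flag in ranked:
        captured += 1 if flag else 0
        fraction = captured / n_defaults
        # Trapezoid between consecutive points of the CAP curve.
        area_cap += 0.5 * (previous_fraction + fraction) / n
        previous_fraction = fraction
    default_rate = n_defaults / n
    # Area between model and random CAP, divided by area between perfect and random CAP.
    return (area_cap - 0.5) / (0.5 * (1.0 - default_rate))


def area_under_roc(scores, defaulted):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly."""
    bad = [s for s, flag in zip(scores, defaulted) if flag]
    good = [s for s, flag in zip(scores, defaulted) if not flag]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


# Toy sample: four firms, two of which defaulted.
toy_scores = [0.9, 0.7, 0.5, 0.3]
toy_defaults = [True, False, True, False]
toy_ar = accuracy_ratio(toy_scores, toy_defaults)
toy_auc = area_under_roc(toy_scores, toy_defaults)
```

For measures where a higher value means lower risk, such as Z-Scores, the sign of the score would be flipped before ranking.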

3.3 Level Validation with Default Data

The level validation of EDF credit measures verifies how well the model's predicted default rates track realized default
rates. We employ the methodology described in Bohn, Arora and Korablev (2005), which was first developed in
Kurbat and Korablev (2002). The procedure is summarized in the following four steps:
1. Using a Monte Carlo technique, we simulate asset value movements based on a single-factor Gaussian model to
capture correlated defaults.
2. We determine the default/non-default state based on the level of each firm's EDF credit measure and each simulation
outcome.
3. We compare the actual default rate to the median, 10th percentile, and 90th percentile of the simulated distribution.
4. We compute the probability of observing a default rate less than or equal to the realized default rate given the model
and the correlation coefficient.

We extend this methodology by using Bayesian methods to compute the posterior distribution of the aggregate shock
given the realized default rate, the model, and the correlation coefficient. The extension to the original methodology is
developed in Dwyer (2007).
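The four simulation steps listed above can be illustrated with a minimal sketch. It assumes the textbook single-factor Gaussian setup, in which firm i defaults when sqrt(rho) × Z + sqrt(1 - rho) × e_i falls below the inverse normal of its EDF value, where Z is the aggregate shock and e_i is an idiosyncratic shock. The function names, the correlation value, the portfolio, and the number of simulations are hypothetical choices for illustration and are not taken from the cited papers.

```python
import random
from statistics import NormalDist, mean, median

STD_NORMAL = NormalDist()


def simulate_default_rates(pds, rho, n_sims=2000, seed=0):
    """Steps 1 and 2: simulate correlated defaults and record the default rate."""
    rng = random.Random(seed)
    thresholds = [STD_NORMAL.inv_cdf(p) for p in pds]
    factor_weight = rho ** 0.5
    idio_weight = (1.0 - rho) ** 0.5
    rates = []
    for _ in range(n_sims):
        shock = rng.gauss(0.0, 1.0)  # aggregate shock shared by all firms
        defaults = sum(
            1 for threshold in thresholds
            if factor_weight * shock + idio_weight * rng.gauss(0.0, 1.0) < threshold
        )
        rates.append(defaults / len(pds))
    return rates


def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list."""
    index = round(q * (len(sorted_values) - 1))
    return sorted_values[index]


def level_validation(pds, realized_rate, rho, n_sims=2000, seed=0):
    """Steps 3 and 4: summary of the simulated distribution and the P-value."""
    rates = sorted(simulate_default_rates(pds, rho, n_sims, seed))
    return {
        "mean": mean(rates),
        "median": median(rates),
        "p10": percentile(rates, 0.10),
        "p90": percentile(rates, 0.90),
        # Probability of a default rate at or below the realized rate.
        "p_value": sum(1 for r in rates if r <= realized_rate) / len(rates),
    }


# Hypothetical portfolio: 200 firms, each with a 2% EDF, asset correlation 0.2.
summary = level_validation([0.02] * 200, realized_rate=0.03, rho=0.2, seed=1)
```

With positive correlation the simulated distribution is right-skewed, so the mean predicted default rate sits above the median.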

3.3.1 Interpreting the Analytical Outputs for Level Validation


We create two graphs as outputs of the level validation test. Figure 1 is an illustrative example of the first output: a
comparison of the median predicted (by simulation) default rate and the realized default rate. The median predicted
default rate is the black line. The red line represents the actual default rate. Fifty percent of the time, the actual default
rate should be above (or below) the median. We also show the mean of the predicted default rate, which is the blue line.
Most of the time, the actual default rate should be below the average predicted default rate, because correlated defaults
make the distribution of default rates right-skewed, so that the mean exceeds the median. The two gray lines correspond
to the prediction interval, which represents the range of variability that is expected in the realized default rates given the
EDF values and the assumed correlation model. This prediction interval implies that eighty percent of the time the realized
default rate should lie within the 10th and the 90th percentiles.⁶

FIGURE 1 Illustrative example of the level validation output. Comparison of median predicted default rate and realized
default rate. Chart annotations: the actual default rate should lie within the 10th and 90th percentiles 80% of the time;
most of the time the actual default rate should be below the average predicted default rate; fifty percent of the time the
actual default rate should be above (or below) the median predicted default rate.

⁶ This prediction interval differs from the concept of a confidence interval. An x% confidence interval is a random interval
for which the probability of it holding the true value of a parameter is x%. In our context here, an x% prediction interval
has the interpretation that x% of the time the realized default rate will be within this range given the EDF levels and the
correlation model.


FIGURE 2 Illustrative example of the level validation output. Posterior distribution of the aggregate shock and P-value
of the actual default rate. Chart annotations: the P-value measures the probability of observing a default rate at or lower
than the actual default rate; the median value of the aggregate shock given the actual default rate is marked.
The figure depicts the posterior distribution of the aggregate shock that was derived given the realized default rate, the
model, and the correlation coefficient. We also computed the P-value of the actual default rate, which is the probability of
observing a default rate at or lower than the actual default rate. This P-value is shown as a blue line.
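One way to see what such a posterior looks like is the following minimal sketch. It assumes a homogeneous portfolio (every firm has the same default probability), a standard normal prior on the aggregate shock, and a binomial likelihood for the number of defaults conditional on the shock, evaluated on a grid. In this sketch a negative shock is an adverse state. These are simplifying assumptions made for illustration, and the function names and inputs are hypothetical; the procedure in Dwyer (2007) may differ.

```python
from math import comb
from statistics import NormalDist

N01 = NormalDist()


def conditional_pd(pd, rho, shock):
    """Default probability given the aggregate shock (negative shock = bad state)."""
    return N01.cdf((N01.inv_cdf(pd) - rho ** 0.5 * shock) / (1.0 - rho) ** 0.5)


def shock_grid(lo=-6.0, hi=6.0, step=0.01):
    """Evenly spaced grid of aggregate shock values."""
    n_steps = int(round((hi - lo) / step))
    return [lo + i * step for i in range(n_steps + 1)]


def posterior_shock(n_firms, n_defaults, pd, rho):
    """Grid posterior of the shock: N(0, 1) prior times binomial likelihood."""
    grid = shock_grid()
    weights = []
    for z in grid:
        q = conditional_pd(pd, rho, z)
        # The binomial coefficient is constant in z and cancels on normalization.
        likelihood = q ** n_defaults * (1.0 - q) ** (n_firms - n_defaults)
        weights.append(N01.pdf(z) * likelihood)
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_median(grid, probs):
    """Smallest grid point at which the cumulative posterior reaches 50%."""
    running = 0.0
    for z, p in zip(grid, probs):
        running += p
        if running >= 0.5:
            return z
    return grid[-1]


def p_value(n_firms, n_defaults, pd, rho):
    """Probability of observing n_defaults or fewer, integrating over the shock."""
    numerator = denominator = 0.0
    for z in shock_grid():
        q = conditional_pd(pd, rho, z)
        cdf = sum(comb(n_firms, k) * q ** k * (1.0 - q) ** (n_firms - k)
                  for k in range(n_defaults + 1))
        weight = N01.pdf(z)
        numerator += weight * cdf
        denominator += weight
    return numerator / denominator


# Hypothetical year: 200 firms with a 2% EDF, correlation 0.2, 12 realized defaults.
grid, probs = posterior_shock(n_firms=200, n_defaults=12, pd=0.02, rho=0.2)
median_shock = posterior_median(grid, probs)
```

A realized default rate well above the predicted level pulls the posterior toward adverse shocks, while a year with very few defaults pulls it toward favorable ones.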

3.4 Level Validation with CDS Data

This test analyzes the level bias in European EDF credit measures relative to U.S. EDF credit measures. The
rationale for the test is the assumption that similar risks should command similar premiums in the U.S. and Europe.
We compare the median as well as the 25th and 75th percentile CDS levels of the two regions, U.S. and Europe, across
EDF-implied rating groups. The same EDF categories should have the same aggregate median spreads in the CDS market
across the two regions. We used Mark-It composite CDS data from January 2003 to December 2006. The Europe region is
based on the following currencies: Euro, Austrian Schilling, Belgian Franc, Swiss Franc, Czech Koruna,
Deutsche Mark, Danish Krone, Spanish Peseta, Finnish Markka, French Franc, Greek Drachma, Hungarian Forint,
and British Pound. The U.S. region is based on the U.S. dollar.

3.5 Median EDF by Rating Category across Regions

We calculate and compare median EDF credit measures for North American non-financial companies, Asian-Pacific
non-financial companies, European non-financial companies, and global financial companies by several rating categories.
In the absence of other measures of credit risk (e.g., spreads or defaults), a comparison with ratings provides a sanity check
on the rank ordering of risk produced by the EDF credit measure and on the comparability of the level of the EDF credit
measure across geographies.


4 EMPIRICAL RESULTS
In this section, we describe empirical results.

4.1 North America

In this section, we describe empirical results obtained in North America. Results are separated into U.S. companies and
North American companies that are headquartered outside of the U.S. The latter are predominantly headquartered in
Canada, Bermuda, and the Cayman Islands.

4.1.1 Data
We start with all U.S. firms that have publicly traded equity from 1996 to 2006, unless otherwise specified. We restrict the
sample to non-financial firms with more than $30 million in size.⁷ For level validation we impose a further restriction of
$300 million in size.
We also present results for comparable North American firms that are outside of the U.S. (Canada, Bermuda, Cayman
Islands, Bahamas, Belize, Panama, Virgin Islands, and Netherlands Antilles). Table 1 shows the countries and the
number of firm-months in each country that constitute the North American module in Credit Monitor and CreditEdge.
Outside of the U.S., the largest countries are Canada, Bermuda, and the Cayman Islands.
TABLE 1 Countries in the North American Database

Country                 Number of Observations (firm-months)
Netherlands Antilles    776
Bahamas                 440
Belize                  85
Bermuda                 3,552
Canada                  153,971
Cayman Islands          975
Panama                  245
USA                     1,127,452
Virgin Islands          491

For all comparisons against ratings, we used Moody's ratings.

Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges, and
insolvency proceedings. The defaults have been collected on a daily basis for more than ten years using a variety of
printed and online sources.⁸ By the end of 2006, we had about 7,900 public defaults worldwide. About 5,600 defaults
were from North America.

⁷ Size is measured by the sales of the firm for non-financial firms. Wherever the firm's total sales number was not available,
we used the book asset value of the firm. This number was further adjusted for inflation across years by converting the
numbers to a common denomination using a deflator calculated internally at Moody's KMV.
⁸ To collect defaults, we use numerous printed and online sources from around the world on a daily basis. We use government
filings, government agency sources, company announcements, news services, specialized default news sources, and even
sources within financial institutions to ensure to the greatest extent possible that we find all defaults. We also keep
evidence in electronic format so that content can be easily verified. As a result, Moody's KMV has the most extensive
default database for public firms.


4.1.2 Timely Default Prediction - U.S.

In this section, we compare the performance of EDF credit measures against agency ratings in their ability to provide
timely warning of defaults. Figure 3 demonstrates how the median EDF credit measure (represented by the solid black line)
starts rising 24 months before the actual default, while the median Moody's rating stays flat until 13 months before default,
and then shows a steep rise about 5 months before default. In that sense, the EDF credit measures seem to lead the
ratings. This is helped further by the fact that the EDF credit measure is continuous, and therefore one can see
a steady and continuous rise in the aggregate. Ratings, on the other hand, are discrete, and therefore one sees a step-like
function with flat stretches, implying that this measure does not instantaneously pick up the most currently available
information.
To test the robustness of the results, we further divided our data into two subperiods:

1996-2000

2001-2006

The period 1996-2000 is shown in the left panel of Figure 4, and the period 2001-2006 is shown in the right panel of
Figure 4. Both EDF credit measures and ratings start at a higher level 24 months prior to default in the latter half of the
sample. EDF credit measures continued to lead the agency rating in each subperiod, indicating that EDF credit measures
indeed provide a more timely warning of impending defaults.

FIGURE 3 Comparison of median agency ratings with Moody's KMV EDF values for rated defaulted firms
in the U.S. from 2 years before default to 1 year after default between 1996 and 2006. Chart annotation: the EDF
measure is leading the rating by 11 months.

FIGURE 4 Comparison of median agency ratings with Moody's KMV EDF values for rated defaulted firms
in the U.S. from 2 years before default to 1 year after default for subsamples: 1996-2000 (left panel)
and 2001-2006 (right panel)

4.1.3 Default Predictive Power - U.S.

In this section, we compare the performance of EDF credit measures against agency ratings, Z-Scores, and a simple
Merton model in their ability to discriminate between good and bad firms. Our test statistic is the Accuracy Ratio as
defined earlier. We also show the plots of Cumulative Accuracy Profiles of these measures for various subsamples selected
using different horizons and size filters.
EDF Credit Measure vs. Agency Rating
Figure 5 shows the performance of EDF credit measures against ratings on the entire sample period of 1996-2006. By
design, this test is restricted to the sample of rated firms only. The EDF credit measure performs better
than ratings on the entire sample period, with Accuracy Ratios of 0.88 and 0.75, respectively.
To ensure that the measure is robust in its performance across various time horizons, we divide our sample into two
subsets of data based on time periods:

1996-2000

2001-2006

We provide the analysis by three different size categories:

Size is greater than $30 million

Size is between $30 and $300 million

Size is greater than $300 million


FIGURE 5 Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit
measures and agency ratings for U.S. non-financial companies between 1996 and 2006. The Accuracy
Ratios for the EDF measure and agency rating are 0.88 and 0.75, respectively.
Table 2 illustrates the results for the subsamples. We find that the EDF credit measure substantially outperforms ratings
in all categories, with an Accuracy Ratio that is higher by at least 0.12.
TABLE 2 Accuracy Ratios by category for EDF Credit Measures and agency ratings for U.S. non-financial companies

Date                                EDF Credit Measure    Ratings
1996-2006                           0.88                  0.75
1996-2000                           0.87                  0.75
2001-2006                           0.88                  0.75
1996-2006, Size > $30 Million       0.88                  0.75
1996-2006, Size $30-$300 Million    0.75                  0.57
1996-2006, Size > $300 Million      0.89                  0.76


We also calculated Accuracy Ratios at horizons longer than one year. The results are presented in Table 3. EDF
credit measures have more discriminatory power than agency ratings at all horizons, but the difference is smaller at
longer horizons.
TABLE 3 Accuracy Ratios of one- to five-year EDF credit measures and agency ratings
for U.S. non-financial companies between 1991 and 2006

Horizon                          EDF Credit Measure    Ratings    Number of Observations    Number of Defaults
One-year EDF credit measure      0.88                  0.76       2031                      354
Two-year EDF credit measure      0.81                  0.73       1926                      374
Three-year EDF credit measure    0.77                  0.71       1917                      385
Four-year EDF credit measure     0.72                  0.70       1892                      400
Five-year EDF credit measure     0.69                  0.68       1850                      404

The Accuracy Ratios (AR) for both the EDF credit measure and agency rating decrease with horizon. The difference between ARs
becomes more compressed at longer horizons.

Figure 6 and Figure 7 present the Accuracy Ratios for the EDF credit measure and agency rating by year at one- and
five-year horizons, respectively.⁹ For each year, we used the EDF credit measure as of the last market day of the prior year
to predict default during the next one or five years.
At a one-year horizon, the EDF credit measure has better discriminatory power than agency rating in all years except
1996, which had the fewest defaults. At a five-year horizon, the EDF credit measure also outperforms agency
rating in all years except 2000.

⁹ The numbers underlying Figures 6 and 7 are summarized in Tables 15 and 16 of Appendix B.

16

FIGURE 6 Accuracy Ratios for EDF credit measures and agency ratings for U.S. non-financial companies by year at the one-year horizon
[Chart: vertical axis 0.50 to 1.00; horizontal axis years 1996-2006; series: 1-Year EDF Credit Measure, Agency Rating]

FIGURE 7 Accuracy Ratios for EDF credit measures and agency ratings for U.S. non-financial companies by year at the five-year horizon
[Chart: vertical axis 0.50 to 1.00; horizontal axis years 1991-2002; series: 5-Year EDF Credit Measure, Agency Rating]
EDF Credit Measure vs. Merton Default Probability and Z-Score
In this section we compare the performance of EDF credit measures to the Merton model's implied default probabilities and Z-Scores, as described in Section 2. The sample period is 1996-2006. Unlike rated firms, which are usually larger and higher profile, some unrated firms can be very small and their defaults can go unnoticed. In some cases, informal negotiations or bailouts avoid the default. These cases are likely to contaminate our results. Therefore we filtered out very small firms (size < 30 million dollars) [10] from our sample. For the entire period 1996-2006, the results are shown in Figure 8. The results are computed on a joint sample of Z-Scores, Merton default probabilities, and EDF credit measures, which requires all three values to be non-missing.
We find that the EDF credit measure substantially outperforms the Merton default probability and the Z-Score in its ability to discriminate good firms from bad firms, with Accuracy Ratios of 0.82, 0.72, and 0.66, respectively. We further divide the sample into two size subsets: 30 million dollars to 300 million dollars, and 300 million dollars and above. In both cases, the EDF credit measure outperforms the Merton model and Z-Score, as shown in Table 4.
Once again, as a robustness check, we compared the performance of the three measures across two time periods, 1996-2000 and 2001-2006. The results are shown in Table 4. As expected, our results are fairly robust, with EDF credit measures outperforming Merton default probabilities and Z-Scores in both periods.
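
The sample construction described above (drop firms below 30 million dollars, require all three measures to be present, then split by size) can be sketched as follows. The record layout `FirmYear`, the helper names, and the example rows are hypothetical; only the 30 and 300 million dollar thresholds come from the text.

```python
from typing import NamedTuple, Optional


class FirmYear(NamedTuple):
    size_musd: float              # firm size in millions of dollars
    edf: Optional[float]          # EDF credit measure
    merton_pd: Optional[float]    # Merton default probability
    z_score: Optional[float]      # Z-Score


def joint_sample(rows, min_size=30.0):
    """Keep firms at or above min_size for which all three measures are present."""
    return [
        r for r in rows
        if r.size_musd >= min_size
        and r.edf is not None
        and r.merton_pd is not None
        and r.z_score is not None
    ]


def size_bucket(size_musd):
    """Assign the size buckets used in the text."""
    if size_musd < 30.0:
        return "below 30"
    if size_musd < 300.0:
        return "30-300"
    return "300 and above"


# Hypothetical rows for illustration only.
rows = [
    FirmYear(12.0, 0.04, 0.01, 1.9),     # dropped: too small
    FirmYear(85.0, 0.02, None, 2.4),     # dropped: Merton PD missing
    FirmYear(120.0, 0.03, 0.01, 2.1),    # kept, bucket "30-300"
    FirmYear(950.0, 0.001, 0.0002, 3.5), # kept, bucket "300 and above"
]
sample = joint_sample(rows)
buckets = [size_bucket(r.size_musd) for r in sample]
```

Requiring a common sample means that differences in Accuracy Ratios reflect the measures themselves rather than differences in coverage.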

FIGURE 8 Cumulative Accuracy Performance (CAP) curves comparing Moody's KMV EDF credit measures, Merton default probability, and Z-Scores for U.S. non-financial companies between 1996 and 2006. The Accuracy Ratios for the EDF measure, Merton default probability, and Z-Score are 0.82, 0.72, and 0.66, respectively.

[10] Size is measured by the sales of the firm for non-financial firms. Whenever the firm's total sales number was not available, we used the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a common denomination using a deflation adjustor calculated internally at Moody's KMV.

TABLE 4 Summary of Accuracy Ratios across various size buckets and time horizons for EDF credit measure, Merton default probability, and Z-Score for U.S. non-financial companies

Date/Size                           EDF Credit Measure   Z-Score   Merton Default Probability
1996-2006, Size > $30 Million       0.82                 0.66      0.72
1996-2000, Size > $30 Million       0.82                 0.66      0.73
2001-2006, Size > $30 Million       0.82                 0.67      0.71
1996-2006, Size $30-$300 Million    0.76                 0.65      0.67
1996-2006, Size > $300 Million      0.88                 0.66      0.77

EDF Credit Measure vs. RiskCalc EDF Credit Measure

In this section we compare the performance of EDF credit measures to RiskCalc EDF credit measures calculated for public firms, as described in Section 2. The sample period is 1996-2006. As before, we filtered out very small firms (size < 30 million dollars) [11] from our sample. For the entire period 1996-2006, the results are shown in Figure 9.
We find that EDF credit measures have more discriminatory power than RiskCalc EDF credit measures, which we expected because RiskCalc does not incorporate firm-specific equity market information. Their Accuracy Ratios are 0.82 and 0.68, respectively. We further divided the sample into two size subsets: 30 million dollars to 300 million dollars, and 300 million dollars and above. In both cases, EDF credit measures outperform RiskCalc EDF credit measures, as shown in Table 5. Both measures perform better for larger firms.
Once again, as a robustness check, we compared the performance of the two measures across two time periods, 1996-2000 and 2001-2006. The results are presented in Table 5. As expected, our results are fairly robust, with the EDF credit measures outperforming the RiskCalc EDF credit measures in both periods. The Accuracy Ratio of the EDF credit measure is higher in the second period, while the Accuracy Ratio of the RiskCalc EDF credit measure stays the same.

[11] Size is measured by the sales of the firm for non-financial firms. Wherever the firm's total sales number was not available, we used the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a common denomination using a deflation adjustor calculated internally at Moody's KMV.

FIGURE 9 Cumulative Accuracy Performance (CAP) curves comparing Moody's KMV EDF credit measures and RiskCalc EDF credit measures between 1996 and 2006 for U.S. non-financial companies. The Accuracy Ratios for the EDF measure and the RiskCalc EDF measure are 0.82 and 0.68, respectively.

TABLE 5 Summary of Accuracy Ratios for EDF Credit Measures and RiskCalc EDF Credit Measures for U.S. non-financial companies by different size buckets and time periods

Date/Size                           EDF Credit Measure   RiskCalc EDF Credit Measure
1996-2006, Size > $30 Million       0.82                 0.68
1996-2000, Size > $30 Million       0.81                 0.68
2001-2006, Size > $30 Million       0.83                 0.68
1996-2006, Size $30-$300 Million    0.76                 0.64
1996-2006, Size > $300 Million      0.89                 0.72

The EDF credit measure effectively discriminates between good and bad credits. It performed better than the Z-Score, the RiskCalc model for private firms applied to public firms, and a simple implementation of the Merton model. It leads rating changes in predicting defaults, and it performs well across multiple cuts of the data and multiple horizons.

4.1.4 Accuracy of Levels - U.S.

The test for this criterion draws on the methodology used by Kurbat and Korablev (2002) and Bohn, Arora and Korablev (2005), which is described in Section 3. We also extended this methodology by using Bayesian methods to compute the posterior distribution of the aggregate shock given the realized default rate, the model, and the correlation coefficient, as described in Dwyer (2007).
The other credit risk measures cannot be directly interpreted as physical default probabilities, nor do they provide a framework that can account for the underlying correlations between assets. Therefore they cannot be compared against EDF credit measures for the level test. [12] In addition, smaller firms raise the issue of hidden or missing defaults, as explained in Kurbat and Korablev (2002). Therefore, consistent with that paper, we restrict this test to firms of size 300 million dollars and above.
We first present results broken down by coarser levels of the EDF credit measure, then repeat the analysis for narrower
ranges of the measure.
Results for Firms with EDF Values Below 35%
In the previous validation studies (Kurbat and Korablev (2002), Bohn, Arora and Korablev (2004)), the test was
performed on the EDF 7.1 model, which was capped at 20%. In that case the predicted number of defaults was likely to
underestimate the realized number of defaults due to the truncation effect. Therefore we divided our sample into two:
EDF credit measures less than 20% and EDF credit measures equal to 20%.
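
The truncation effect can be seen with a small arithmetic sketch. The four "true" default probabilities below are hypothetical; only the 20% and 35% caps come from the text. Capping replaces every probability above the cap with the cap itself, so the predicted number of defaults can only fall.

```python
def expected_defaults(default_probs, cap):
    """Expected number of defaults when each probability is capped at `cap`."""
    return sum(min(p, cap) for p in default_probs)


# Hypothetical default probabilities for four very risky firms.
true_pds = [0.05, 0.15, 0.30, 0.50]

uncapped = sum(true_pds)                          # 1.00 expected defaults
with_cap_20 = expected_defaults(true_pds, 0.20)   # 0.05 + 0.15 + 0.20 + 0.20
with_cap_35 = expected_defaults(true_pds, 0.35)   # 0.05 + 0.15 + 0.30 + 0.35
```

In this toy case the 20% cap predicts 0.60 defaults and the 35% cap predicts 0.85, against 1.00 without a cap, which is why a higher cap reduces the underestimation.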
One of the main features of the EDF 8.0 model is the new cap of 35%. We therefore expect the truncation effect to lessen or even disappear. Nevertheless, to be consistent with the previous studies, we split the sample into two: firms with EDF values less than 35% (3500 bps) and firms with EDF values equal to 35%. The comparison for the sample of firms with EDF values less than 35% is shown in Figure 10. The left panel of Figure 10 displays the mean and median predicted (by simulation) default rates and the actual default rate for EDF values below 35%, along with the 80% confidence set for the predicted default rate. We used an asset correlation of 0.19 to simulate defaults in each year. The right panel of Figure 10 presents the posterior distribution of the aggregate shock given the actual default rate, together with the P-values of the actual default rate, that is, the probability of observing a default rate at or below the actual default rate.
The predicted default rate tracks the realized default rate very well. In every year the realized default rate falls within the 80% confidence set (see Table 6). The year that stands out is 2003, an uncharacteristically good year for the economy, with substantially fewer defaults than predicted. To explain the low default rate in 2003, we estimate that the U.S. economy received a positive 0.84 standard deviation shock relative to market expectations. Such a positive shock is consistent with the high returns on the S&P 500 observed during that year. The P-values of the realized default rate range from 21% to 75%, which is within the sampling variability that would be expected.

[12] The exception to this is the Merton model, but the default probabilities implied by the Merton model are too low, and therefore it would usually underestimate the number of defaults.
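
The posterior distribution of the aggregate shock can be sketched on a grid. This is a simplified illustration, not the method of Dwyer (2007) as implemented by the authors: it assumes a one-factor Gaussian model, a standard normal prior on the shock, and a homogeneous portfolio in which every firm has the same default probability. All function names are hypothetical, and the sign convention (a positive shock lowers defaults) follows the text.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_pd(pd, rho, shock):
    """Default probability given the aggregate shock (positive shock = good year)."""
    threshold = STD_NORMAL.inv_cdf(pd)
    return STD_NORMAL.cdf((threshold - math.sqrt(rho) * shock) / math.sqrt(1.0 - rho))


def log_binomial_pmf(k, n, p):
    """Log of the binomial probability of k defaults among n firms."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def shock_posterior(pd, rho, n_firms, n_defaults, grid_step=0.01, grid_max=5.0):
    """Posterior of the shock on a grid: standard normal prior times binomial likelihood."""
    steps = int(round(2 * grid_max / grid_step))
    grid = [-grid_max + i * grid_step for i in range(steps + 1)]
    log_post = [
        -0.5 * z * z + log_binomial_pmf(n_defaults, n_firms, conditional_pd(pd, rho, z))
        for z in grid
    ]
    peak = max(log_post)
    weights = [math.exp(value - peak) for value in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_quantile(grid, probs, q):
    """Smallest grid point whose cumulative posterior probability reaches q."""
    cumulative = 0.0
    for z, p in zip(grid, probs):
        cumulative += p
        if cumulative >= q:
            return z
    return grid[-1]


# Illustration with the 2003 figures from Table 6 (mean predicted rate 3.0%,
# 1635 firms, 16 defaults) and the asset correlation of 0.19 quoted in the text.
grid, probs = shock_posterior(pd=0.03, rho=0.19, n_firms=1635, n_defaults=16)
rm10 = posterior_quantile(grid, probs, 0.10)
rm50 = posterior_quantile(grid, probs, 0.50)
rm90 = posterior_quantile(grid, probs, 0.90)
```

Because the sketch replaces the actual distribution of EDF values with a single average default probability, its quantiles should not be expected to reproduce the values in Table 7; it only shows that fewer defaults than predicted imply a positive posterior shock.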

FIGURE 10 Comparison of median predicted default rate with the realized default rate, 1991-2006

The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures less than 35%. We used an asset correlation of 0.19 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50, is the median of the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.

We summarize the numbers that underlie Figure 10 in Tables 6 and 7. Table 6 contains the number of firms, the number of defaults, and the median and mean predicted default rates per year, as well as the 10th and 90th percentiles of the predicted default rate. It is clear from this table that the correlation effect skews the distribution of default rates: most of the probability mass lies at low default rates, with a long tail of high default rates, so the median lies below the mean. If we had ignored this effect and simply taken the mean default rate of the sample, we would have grossly over-predicted the realized default rate. Table 7 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the P-value by year.
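
The effect of correlation on the distribution of default rates can be reproduced with a small Monte Carlo sketch. It assumes a one-factor Gaussian model with a single default probability for all firms, which is a simplification of the simulation described in Section 3; the function names, portfolio size, and number of simulations are hypothetical choices made to keep the example fast.

```python
import math
import random
from statistics import NormalDist, mean, median

STD_NORMAL = NormalDist()


def simulate_default_rates(pd, rho, n_firms, n_sims, seed=0):
    """Simulate yearly default rates for a homogeneous, correlated portfolio."""
    rng = random.Random(seed)
    threshold = STD_NORMAL.inv_cdf(pd)
    rates = []
    for _ in range(n_sims):
        shock = rng.gauss(0.0, 1.0)  # aggregate shock, positive = good year
        p = STD_NORMAL.cdf(
            (threshold - math.sqrt(rho) * shock) / math.sqrt(1.0 - rho)
        )
        # Given the shock, firms default independently with probability p.
        defaults = sum(rng.random() < p for _ in range(n_firms))
        rates.append(defaults / n_firms)
    return rates


def percentile(sorted_values, q):
    """Simple empirical percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


def p_value(simulated_rates, actual_rate):
    """Share of simulated default rates at or below the actual default rate."""
    return sum(r <= actual_rate for r in simulated_rates) / len(simulated_rates)


# Average default probability of 2.3% and asset correlation of 0.19.
rates = sorted(simulate_default_rates(pd=0.023, rho=0.19, n_firms=500, n_sims=2000))
summary = {
    "mean": mean(rates),
    "median": median(rates),
    "p10": percentile(rates, 0.10),
    "p90": percentile(rates, 0.90),
}
```

With positive asset correlation the simulated median falls below the simulated mean, which is the asymmetry visible in Table 6. The same simulated rates give the P-value of a realized default rate through `p_value`.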

TABLE 6 Comparison of mean and median predicted default rate with the realized default rate between 1991 and 2006

Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   Percentile   Percentile
1991   2.3%             1.7%               2.5%           0.5%         4.9%         1554    39
1992   1.4%             1.0%               1.0%           0.2%         3.2%         1549    15
1993   1.3%             0.9%               0.9%           0.2%         2.9%         1639    15
1994   1.1%             0.7%               0.6%           0.1%         2.5%         1775    10
1995   1.2%             0.7%               0.9%           0.1%         2.6%         1847    16
1996   1.2%             0.8%               0.9%           0.2%         2.8%         1906    17
1997   1.2%             0.8%               0.8%           0.2%         2.6%         2054    17
1998   1.1%             0.7%               0.9%           0.1%         2.4%         2114    20
1999   1.8%             1.2%               1.0%           0.3%         3.9%         2106    22
2000   2.6%             1.9%               1.9%           0.5%         5.5%         2042    38
2001   3.6%             2.8%               2.7%           0.8%         7.3%         1783    48
2002   2.5%             1.9%               1.8%           0.5%         5.4%         1707    31
2003   3.0%             2.3%               1.0%           0.6%         6.2%         1635    16
2004   1.2%             0.8%               0.7%           0.2%         2.8%         1699    12
2005   0.8%             0.5%               1.0%           0.1%         1.9%         1806    18
2006   0.7%             0.4%               0.2%           0.1%         1.5%         1835

The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures less than 35%.

TABLE 7 Summary table of aggregate shock and year-wise probability of realizing the actual number of defaults between 1991 and 2006

Year   10th Percentile   Median Aggregate Shock   90th Percentile   Probability of having actual defaults or even lower
1991   -0.64             -0.42                    -0.19             68.7%
1992   -0.28             0.03                     0.34              51.7%
1993   -0.37             -0.07                    0.23              57.9%
1994   -0.16             0.19                     0.52              47.3%
1995   -0.42             -0.13                    0.15              58.9%
1996   -0.35             -0.06                    0.23              54.7%
1997   -0.35             -0.06                    0.22              57.7%
1998   -0.52             -0.25                    0.01              64.2%
1999   -0.11             0.15                     0.40              47.7%
2000   -0.20             0.02                     0.23              51.3%
2001   -0.17             0.04                     0.24              49.5%
2002   -0.21             0.03                     0.26              52.0%
2003   0.54              0.84                     1.13              21.3%
2004   -0.20             0.13                     0.45              51.4%
2005   -0.86             -0.58                    -0.31             74.5%
2006   -0.02             0.46                     0.92              45.1%

The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures less than 35%.

Results for Firms with EDF Values Equal to 35%


Figure 11 shows the median predicted and actual number of defaults for EDF credit measures of 35%. We used 0.181 as the asset correlation for pairs of firms in each year to simulate defaults. The companies in this sample are, on average, somewhat less correlated with each other than the set of firms with EDF credit measures of less than 35%. We find that the realized default rate ranges from 10% to 67%. The high default rate in 1998 is indicative of a large negative shock, which is shown in Figure 11 along with the P-values of the realized default rate. The P-values range from 8% to 93%, which is within the sampling variability that would be expected over a 15-year period.
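
A rough way to judge whether P-values between 8% and 93% are unremarkable is sketched below. It assumes that, for a correctly calibrated model, yearly P-values are independent and approximately uniform on [0, 1]; both assumptions are simplifications introduced here, not claims made by the paper. The P-values are those reported in Table 9.

```python
# P-values (in percent) by year, as reported in Table 9.
p_values_table9 = {
    1991: 63.1, 1992: 30.6, 1993: 50.0, 1994: 10.8,
    1995: 17.0, 1996: 12.0, 1997: 12.4, 1998: 92.8,
    1999: 53.8, 2000: 60.2, 2001: 64.4, 2002: 65.7,
    2003: 60.0, 2004: 28.7, 2005: 8.0, 2006: 17.9,
}

n_years = len(p_values_table9)

# Years whose P-value falls outside the central 80% band.
years_outside = sorted(
    year for year, p in p_values_table9.items() if p < 10.0 or p > 90.0
)

# Under the uniformity assumption each year falls outside with probability 0.2.
expected_outside = 0.2 * n_years
prob_at_least_one_outside = 1.0 - 0.8 ** n_years
```

Under these assumptions about three of the sixteen years would be expected to fall outside the 10% to 90% band, and observing at least one such year is close to certain, so the two years found here (1998 and 2005) do not by themselves indicate miscalibration.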

FIGURE 11 Comparison of median predicted default rate with the realized default rate, 1991-2006

The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures equal to 35%. We used an asset correlation of 0.181 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50, is the median of the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.

We summarize the numbers that underlie Figure 11 in Tables 8 and 9. Table 8 contains the number of firms, the number of defaults, and the median and mean predicted default rates per year, as well as the 10th and 90th percentiles of the predicted default rate. It is clear from this table that the correlation effect skews the distribution of default rates, so that the median predicted default rate lies below the mean. Table 9 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the P-value by year.

TABLE 8 Comparison of mean and median predicted default rate with the realized default rate between 1991 and 2006

Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   Percentile   Percentile
1991   35.0%            33.4%              40.0%          12.7%        59.7%        30      12
1992   35.0%            33.4%              24.0%          12.3%        60.3%        25
1993   35.0%            33.5%              33.3%          11.8%        60.9%        21
1994   35.0%            33.5%              11.8%          11.2%        61.9%        17
1995   35.0%            33.7%              15.4%          10.4%        63.5%        13
1996   35.0%            33.6%              12.5%          11.0%        62.2%        16
1997   35.0%            34.2%              11.1%          9.2%         67.1%
1998   35.0%            33.6%              66.7%          10.8%        62.6%        15      10
1999   35.0%            33.4%              35.1%          13.1%        59.2%        37      13
2000   35.0%            33.5%              38.3%          13.6%        58.8%        47      18
2001   35.0%            33.5%              40.2%          14.5%        57.8%        107     43
2002   35.0%            33.5%              41.0%          13.9%        58.4%        61      25
2003   35.0%            33.5%              38.0%          14.1%        58.2%        71      27
2004   35.0%            33.4%              23.1%          12.4%        60.1%        26
2005   35.0%            33.5%              10.0%          11.7%        61.1%        20
2006   35.0%            33.5%              16.7%          11.4%        61.6%        18

The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures equal to 35%.


TABLE 9 Summary table of aggregate shock and year-wise probability of realizing the actual number of defaults between 1991 and 2006

Year   10th Percentile   Median Aggregate Shock   90th Percentile   Probability of having actual defaults or even lower
1991   -0.86             -0.30                    0.24              63.1%
1992   -0.21             0.44                     1.06              30.6%
1993   -0.65             0.00                     0.63              50.0%
1994   0.13              0.94                     1.76              10.8%
1995   -0.14             0.69                     1.54              17.0%
1996   0.07              0.88                     1.71              12.0%
1997   -0.21             0.72                     1.68              12.4%
1998   -1.96             -1.23                    -0.53             92.8%
1999   -0.60             -0.08                    0.42              53.8%
2000   -0.71             -0.24                    0.22              60.2%
2001   -0.68             -0.35                    -0.04             64.4%
2002   -0.79             -0.38                    0.03              65.7%
2003   -0.63             -0.24                    0.15              60.0%
2004   -0.15             0.49                     1.11              28.7%
2005   0.30              1.08                     1.89              8.0%
2006   -0.03             0.73                     1.48              17.9%

The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures equal to 35%.

Results by EDF Subgroups


To test the robustness of our results, we further divide the sample of firms with EDF values less than 35% into smaller groups. The EDF buckets, along with the correlation used for the default simulation in each bucket, are presented in Table 10.
TABLE 10 EDF buckets

Stratum   EDF Range   Correlation   Number of Firms
1         0.01-5      0.191         22887
2         5-12        0.177         1264
3         12-35       0.192         442


Figures 12, 13, and 14 show the median, mean, and prediction interval for the predicted default rate, together with the actual default rate, for EDF values in the ranges [0.01, 5), [5, 12), and [12, 35), respectively. It is clear from these figures that while the predicted and realized default rates can deviate from each other in certain years, there is no substantial bias in their levels over the long run. In general, the two levels track each other very well. All predicted default rates fall within the prediction interval. Year 2003 was an uncharacteristically good year for the economy, leading to a substantially lower number of defaults in two of the three subgroups.

FIGURE 12 Comparison of median predicted default rate with the realized default rate, 1991-2006

The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 0.01% and 5%. We used an
asset correlation of 0.191 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for
realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50
is the median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the
blue line is the P-value of the actual default rate, which is the probability of observing a default at or lower than the actual default rate.

FIGURE 13 Comparison of median predicted default rate with the realized default rate, 1991-2006

The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 5% and 12%. We used an asset correlation of 0.177 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right panel shows the posterior distribution of the aggregate shock and P-values. The dark black line, rm50, is the median of the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.

FIGURE 14 Comparison of median predicted default rate with the realized default rate, 1991-2006

The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 12% and 34.99%. We used
an asset correlation of 0.192 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval
for realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50
is the median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the
blue line is the P-value of the actual default rate, which is the probability of observing a default at or lower than the actual default rate.

4.1.5 Timely Default Prediction Outside the U.S.


The timeliness test outside the U.S. produces results very similar to those in the U.S. The median EDF credit measure starts rising 24 months before the actual default, while the median rating moves from B2 to B3 about 18 months before default, then stays flat until six months before default, at which point it deteriorates sharply. EDF credit measures clearly lead ratings.

FIGURE 15 Comparison of median agency ratings with Moody's KMV EDF values for defaulted firms from two years before default to one year after default, for North American companies outside the U.S., sample period 1996-2006

4.1.6 Default Predictive Power Outside the U.S.


In this section, we compare the performance of Moody's KMV EDF credit measures, Z-Scores, and a simple Merton model in their ability to discriminate between good and bad firms. We do not perform a power test against agency ratings because of the small number of rated defaults outside the U.S.
EDF Credit Measure vs. Merton Default Probability and Z-Score
In this section we compare the performance of EDF credit measures to the Merton model and Z-Scores, as described in Section 2. The sample period is 1996-2006. We filtered out very small firms (size < 30 million dollars) from our sample, [13] as we did in the case of U.S. companies. Results for the entire period 1996-2006 are shown in Figure 16. The results are computed on a joint sample of Z-Scores, Merton default probabilities, and EDF credit measures; all three values must be non-missing for an observation to be included.
We find that the EDF credit measure discriminates good firms from bad firms more effectively than the Merton default probability and the Z-Score, with Accuracy Ratios of 0.78, 0.70, and 0.65, respectively. Because of the small sample size, we do not divide the sample into two subsamples as we did for the U.S.

[13] Size is measured by the sales of the firm for non-financial firms. Whenever the firm's total sales number was not available, we used the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a common denomination using a deflation adjustor calculated internally at Moody's KMV.

FIGURE 16
Cumulative Accuracy Performance (CAP) curves comparing EDF credit measures,
Merton Default Probability and Z-Scores between 1996 and 2006 for North American companies
outside the U.S. The Accuracy Ratios for the EDF credit measure, Merton Default Probability and
Z-Score are 0.78, 0.70 and 0.65 respectively.

4.1.7 Accuracy of Levels Outside the U.S.


Figure 17 presents the level validation results for the sample of firms with EDF credit measures below 35%. The left panel of Figure 17 displays the mean and median predicted default rates and the actual default rate, as well as the 80% confidence set for predicted defaults. We used an asset correlation of 0.19 to simulate defaults in each year. The right panel of Figure 17 displays the posterior distribution of the aggregate shock given the actual default rate, together with the P-values of the actual default rate, that is, the probability of observing a default rate at or below the actual default rate.
The predicted default rate tracks the realized default rate very well. The realized default rate fluctuates around the median predicted default rate. In all years except 1991, predicted default rates fall within the confidence set. Year 1991 was a good year, leading to a lower number of defaults. To explain the low default rate in 1991, we estimate that the U.S. economy received a positive 0.82 standard deviation shock relative to market expectations. P-values are between 8% and 75%, which is in the range we expect over a 15-year period.

FIGURE 17 Comparison of median predicted default rate with the realized default rate, 1991-2006

The sample was restricted to North American firms outside the U.S. larger than 300 million dollars with EDF credit measures less than 35%. We used an asset correlation of 0.19 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right panel shows the posterior distribution of the aggregate shock and P-values. The dark black line, rm50, is the median of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles of the aggregate shock; the blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.

4.1.8 Conclusion
Results obtained for the North American sample show that the EDF credit measure leads the agency rating in timely default prediction. The EDF credit measure also leads the alternative measures in its ability to discriminate good firms from bad firms over time and across various subsamples of the data. We also showed that the model-predicted default rates track realized default rates well and that the model works well not only in the U.S., but also in North America excluding the U.S.

4.2 Europe

In this section, we describe the results obtained in Europe.

4.2.1 Diversity in Bankruptcy Mechanisms and Creditor Protection


Bankruptcy mechanisms can differ between regions. For example, Davydenko and Franks (2005) found that while the British bankruptcy mechanism is designed to be extremely creditor friendly, the French system is geared toward protecting a business as a going concern, even at the expense of its creditors. [14] When interpreting the validation results, it is important to understand the impact of these mechanisms on the outcome of the model. For example, if a system is too creditor-friendly, the creditors can pressure the firm at the slightest hint of distress. This action may cause a firm to file for bankruptcy sooner, although the recovery for creditors may be higher. On the other hand, if the system is too geared toward protecting a firm, the creditors may not be allowed to take a firm to court even if it is in severe distress.

[14] A brief description of the similarities and differences among British, French, and German bankruptcy mechanisms is provided in Korablev (2005).

A second characteristic is the nature of debt in an economy. A creditor-debtor relationship might be close (as in Japan) or at arm's length (as in the U.S.). If the creditors are few and have a close relationship with the debtor, they are more likely to evaluate the long-term potential of the debtor before taking it into bankruptcy. If the creditors are scattered, there is a higher likelihood of a free-rider problem, leading to a forced bankruptcy even if the debtor may have some long-term positive potential.
In general, we see an equal contribution of non-bankruptcy defaults and bankruptcies in North America, while the European cases of distress are dominated by bankruptcies, as shown in Figure 18. [15] This may be influenced by two factors. First, in many economies within Europe, debt is held more closely than in the U.S., making it more likely that firms enter private renegotiations of debt and avoid default during a liquidity crunch. Second, many defaults may not be covered by the media, and are in that sense hidden. These two factors should not apply to larger firms because their debt is usually widely held and they are followed more closely by the media.
Figure 19 compares the percentage representation of default cases in Europe and North America by size over the period 1996-2006. Defaults as a fraction of total distress cases are substantially smaller in Europe for small and mid-sized firms. Larger firms, however, have more comparable default behavior across Europe and North America. This shows that the model validation is more reliable on the sample of large companies because of the quality of data on actual defaults.
FIGURE 18 Percentage representation of defaults and bankruptcies in North American and European markets between 1996 and 2006
[Two panels: North America and Europe; vertical axis 0% to 100%; horizontal axis years 1996-2006; series: Bankruptcy, Defaults]
FIGURE 19 Default events as a percentage of all distress cases across three size buckets between 1996 and 2006
[Two panels: North America (vertical axis 0% to 80%) and Europe (vertical axis 0% to 70%); horizontal axis years 1996-2006; series: Size < 30 Million, 30 Million <= Size <= 300 Million, Size > 300 Million]

[15] The following events constitute non-bankruptcy defaults: missed interest or principal payment, distressed extension of a loan, distressed exchange offer, delay in paying a substantial portion of trade debt, and government takeover of a financial institution to prevent market collapse.

The success of a model relies on the ability of the inputs to take regional nuances into account. A model whose inputs are not universal in concept may have more difficulty capturing the differences in characteristics of the system in which it is being implemented. As long as the economic fundamentals of a model are universal in nature, it is not necessary to interpret its output differently across different regions. For the Moody's KMV EDF model, one of the main drivers is asset value, which is inferred from the equity value and an underlying structural framework. The model should work well for data from individual regions and for data pooled across them because the equity markets take into account the regional differences.
The extent to which different equity markets accurately reflect firm value and volatility has implications for the power
and the level performances of the model. In fact, even if a model is powerful in discriminating defaulters from
non-defaulters in different regions, but is off in its level performance, the aggregation of data across regions will make the
model seem less powerful. For example, if a distance-to-default (DD) of 2 corresponds to an EDF credit measure of 5%
in the U.K., but 2% in France, then an aggregation of data would incorrectly suggest that both a U.K. firm and a French
firm with a DD of 2 correspond to the same rank in our test. In that sense, a default predictive power test on a dataset
aggregated across different regions essentially tests a joint hypothesis that the model is powerful and that the DD-to-EDF
mapping is similar across different regions. It could be the case that the model might be powerful in two regions
separately, but may appear less powerful if the data are aggregated.
Similarly, while testing for levels, one could imagine that the model had specified levels in two regions incorrectly,
overestimating the default rate in one region and underestimating it in the other. However, it may work well on the
aggregated dataset. Therefore, a reasonable level performance on aggregated data is a necessary, but not a sufficient, test
for the level performance of the model in each region. Unfortunately, there is an insufficient number of defaults available
to perform a reliable level test in each subregion of Europe.
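
The joint-hypothesis point above can be illustrated with a small calculation. The sketch below is not part of the original study: the two regions, the DD buckets, and the default rates are invented for illustration, and the Accuracy Ratio is computed from expected counts using the relationship AR = 2AUC - 1 discussed in Appendix A. It ranks the same pooled firms twice, once with a single common DD-to-EDF mapping (ranking on DD alone) and once with region-specific levels.

```python
def accuracy_ratio(groups):
    """AR = 2*AUC - 1 from (risk_score, defaults, non_defaults) groups.

    A higher risk_score means riskier; tied scores earn half credit.
    """
    total_d = sum(d for _, d, _ in groups)
    total_n = sum(n for _, _, n in groups)
    concordant = 0.0
    for score_d, d, _ in groups:
        for score_n, _, n in groups:
            if score_d > score_n:
                concordant += d * n
            elif score_d == score_n:
                concordant += 0.5 * d * n
    return 2.0 * concordant / (total_d * total_n) - 1.0


# Hypothetical true default rates by DD bucket (assumption, not study data).
TRUE_RATE = {
    "UK": {1: 0.10, 2: 0.05, 3: 0.02},
    "FR": {1: 0.04, 2: 0.02, 3: 0.008},
}
FIRMS_PER_BUCKET = 1000  # hypothetical, same size for every bucket


def build_groups(score_fn):
    """Pool both regions and attach a risk score to every DD bucket."""
    groups = []
    for rates in TRUE_RATE.values():
        for dd, rate in rates.items():
            defaults = FIRMS_PER_BUCKET * rate
            groups.append((score_fn(dd, rate), defaults, FIRMS_PER_BUCKET - defaults))
    return groups


# One DD-to-EDF mapping for everyone: rank on DD alone (lower DD = riskier).
ar_common_mapping = accuracy_ratio(build_groups(lambda dd, rate: -dd))
# Region-specific levels: rank on each region's own default rate.
ar_regional_levels = accuracy_ratio(build_groups(lambda dd, rate: rate))

print(f"pooled AR, common mapping:  {ar_common_mapping:.3f}")
print(f"pooled AR, regional levels: {ar_regional_levels:.3f}")
```

In this toy setup the pooled Accuracy Ratio is lower under the common mapping, even though the ranking within each region is identical in both cases. The gap comes only from the difference in levels across regions.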

4.2.2 Data
We start with all European firms that had publicly traded equity between 1996 and 2006. The sample was then
restricted to non-financial firms with more than $30 million in size (see footnote 16) to avoid the missing and hidden
default problem. For level validation, we imposed a further restriction of $300 million in size.

16

Following our practice in North America, size is measured by the sales of the firm for non-financial firms. Whenever the firm's total
sales number was not available, we used the book asset value of the firm. This number was further adjusted for inflation across
years by converting it to a common denomination using the appropriate consumer price index and exchange rate.


TABLE 11  Number of companies by country in the European Module of Credit Monitor and CreditEdge

Country           Code   Size >= $30 Million   Size >= $300 Million
Austria           AUT    112                   60
Belgium           BEL    134                   77
Switzerland       CHE    232                   153
Czech Republic    CZE    75                    30
Germany           DEU    796                   394
Denmark           DNK    171                   77
Spain             ESP    177                   124
Finland           FIN    153                   81
France            FRA    897                   413
Great Britain     GBR    1959                  788
Greece            GRC    256                   66
Hungary           HUN    37                    15
Ireland           IRL    76                    37
Iceland           ISL
Israel            ISR    136                   48
Italy             ITA    301                   172
Luxembourg        LUX    31                    23
Netherlands       NLD    238                   152
Norway            NOR    232                   90
Poland            POL    118                   43
Portugal          PRT    90                    37
Russia            RUS    61                    57
Slovakia          SVK    16
Slovenia          SVN
Sweden            SWE    311                   134
Turkey            TUR    165                   55

We also present level validation results for the subsample of countries that have more than 100 companies with size of at
least $300 million. These countries are Switzerland, Germany, Spain, France, Great Britain, Italy, Netherlands, and
Sweden. The number of firms by country and size is shown in Table 11.


Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges, and
insolvency proceedings (see footnote 17). For all comparisons against agency ratings, we used Moody's ratings.

4.2.3 Timely Default Prediction


In this section, we compare EDF credit measures against agency ratings in their ability to predict defaults in a timely
manner, following the methodology described in Section 3.1. We create a sample of defaulted firms, retaining
monthly observations from 24 months prior to default until 10 months after default. We include only observations
with a non-missing history of EDF credit measures and ratings over the 24 months prior to default. We
compute the median EDF credit measure and the median rating by months to default and overlay the two
series.
Figure 20 demonstrates that in the event of default, EDF credit measures become elevated 11 months before ratings.
Ratings move later and more abruptly, giving the most signal in the last nine months.
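
The months-to-default profile described above amounts to a grouped median. A minimal sketch follows, assuming observations arrive as (months to default, EDF in percent) pairs. The function name and the toy values are hypothetical and are not taken from the study data.

```python
from collections import defaultdict
from statistics import median


def median_by_months_to_default(observations, start=-24, end=10):
    """Median EDF for each month relative to default (negative = before default)."""
    buckets = defaultdict(list)
    for months_to_default, edf in observations:
        # Keep only the window used in the text: 24 months before to 10 after.
        if start <= months_to_default <= end:
            buckets[months_to_default].append(edf)
    return {month: median(values) for month, values in sorted(buckets.items())}


# Hypothetical (months_to_default, EDF %) observations for a few defaulters.
toy = [
    (-24, 0.8), (-24, 1.2), (-24, 2.0),
    (-12, 3.0), (-12, 5.0), (-12, 9.0),
    (0, 20.0), (0, 20.0), (0, 18.0),
    (-30, 0.5),  # outside the window, dropped
]
profile = median_by_months_to_default(toy)
print(profile)
```

The same grouping applied to numeric rating grades would give the median rating series to overlay against the median EDF.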

FIGURE 20

Median agency ratings and Moody's KMV EDF values for rated defaulted firms in Europe
from 24 months before default to 10 months after default between 1996 and 2006

4.2.4 Default Predictive Power


EDF credit measures outperform simple Merton model implied default probabilities and Z-Scores in their ability to
discriminate between defaulters and non-defaulters, as Figure 21 shows. The Accuracy Ratios for the EDF
credit measure, Merton default probability, and Z-Score are 0.79, 0.70, and 0.61, respectively.

17

To collect defaults, we use numerous printed and online sources from around the world on a daily basis. We use government
filings, government agency sources, company announcements, news services, specialized default news sources, and even sources within
financial institutions to ensure, to the greatest extent possible, that we find all defaults. We also keep evidence in electronic format so
that content can be easily verified. As a result, Moody's KMV has the most extensive default database for public firms.


We divide the sample into two size subsets: $30 million to $300 million, and $300 million and above. In both cases the
EDF credit measure outperforms the Merton model implied default probability and Z-Score, as shown in Table 12. All
three measures perform better for larger firms.
As a robustness check, we compared the performance of the three measures across two time periods, 1996-2000 and
2001-2006. The results, presented in Table 12, show that the EDF credit measure outperforms the Merton model
and Z-Score in both periods. The EDF credit measure and Merton default probability perform better in the 1996-2000
period, while Z-Score has a higher Accuracy Ratio in the second period.

FIGURE 21 Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit measures,
Z-Scores, and Merton default probabilities for European non-financial firms between 1996 and 2006.
The Accuracy Ratios for the EDF credit measure, Z-Score, and Merton default probability are 0.79, 0.61, and 0.70,
respectively.

We summarize the findings of this section in Table 12. The results show that the EDF credit measure in Europe
outperforms the other popular alternatives in its ability to discriminate good firms from bad firms at a 1-year horizon.


TABLE 12  Summary of Accuracy Ratios across various size buckets and time periods for European non-financial firms

Date                                         EDF Credit Measure   Z-Score   Simple Merton Model
1996-2006, Size > $30 Million                0.79                 0.61      0.70
1996-2000, Size > $30 Million                0.79                 0.53      0.71
2001-2006, Size > $30 Million                0.78                 0.64      0.64
1996-2006, Size between $30-$300 Million     0.75                 0.60      0.64
1996-2006, Size > $300 Million               0.83                 0.65      0.77

4.2.5 Level validation with default data


To validate the accuracy of levels we followed the methodology described in Section 3.3.
Results for the Whole Sample
Figure 22 shows the level validation results for the sample of European firms with size greater than $300 million. The
left panel of Figure 22 displays the mean and median predicted default rates and the actual default rate, along with the
80% prediction interval for the default rate. We used an asset correlation of 0.25 to simulate defaults in each year. The
right panel of Figure 22 presents the posterior distribution of the aggregate shock given the actual default rate, and the
P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
The predicted default rate tracks the realized default rate well. There are exceptions, however, during times of systematic
shock. For example, 2002 was a year when the markets crashed and there were an unexpectedly high number of defaults
compared to what was predicted by the model. We estimated that the shock was negative 0.29 standard deviations.
Similarly, year 2003 was an uncharacteristically good year for the economy leading to a substantially lower number of
defaults. The graph of the aggregate shocks shows that in year 2003 the economy experienced a positive shock of 1.02
standard deviations that led to that small default rate.
The results show that all realized default rates fall within the prediction interval. The P-values of the realized default rate
range from 17% to 65%, which is within the sampling variability that would be expected.
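
To make the quantities in this subsection concrete, the sketch below simulates default rates for a correlated portfolio. It is an illustration under stated assumptions, not the procedure of Section 3.3: it assumes a one-factor Gaussian model in which a firm defaults when a weighted sum of an aggregate shock and a firm-specific shock falls below the threshold implied by its EDF. The portfolio (200 firms with a 1% EDF each), the realized rate, and all function names are hypothetical. Only the asset correlation of 0.25 comes from the text.

```python
import random
from statistics import NormalDist, mean, median


def simulate_default_rates(edfs, asset_corr=0.25, n_sims=2000, seed=7):
    """Simulate portfolio default rates under a one-factor Gaussian model."""
    rng = random.Random(seed)
    thresholds = [NormalDist().inv_cdf(p) for p in edfs]
    w_common = asset_corr ** 0.5
    w_own = (1.0 - asset_corr) ** 0.5
    rates = []
    for _ in range(n_sims):
        shock = rng.gauss(0.0, 1.0)  # aggregate shock shared by all firms
        defaults = sum(
            w_common * shock + w_own * rng.gauss(0.0, 1.0) < t
            for t in thresholds
        )
        rates.append(defaults / len(edfs))
    return sorted(rates)


def percentile(sorted_values, q):
    """Simple empirical quantile: the draw at position q of the sorted sample."""
    return sorted_values[int(q * (len(sorted_values) - 1))]


edfs = [0.01] * 200   # hypothetical portfolio: 200 firms, 1% EDF each
rates = simulate_default_rates(edfs)
realized = 0.005      # hypothetical realized default rate
# P-value: share of simulated default rates at or below the realized rate.
p_value = sum(r <= realized for r in rates) / len(rates)

print(f"mean {mean(rates):.4f}, median {median(rates):.4f}")
print(f"80% interval: [{percentile(rates, 0.10):.4f}, {percentile(rates, 0.90):.4f}]")
print(f"P-value of realized rate: {p_value:.2f}")
```

With positive asset correlation, the simulated distribution has a long tail of high default rates, so its mean sits above its median. In this sign convention a positive aggregate shock lowers the default rate, which matches the description of 2003 above.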


FIGURE 22 Comparison of median predicted default rate with the realized default rate, 1996-2006
The sample was restricted to European non-financial firms larger than 300 million dollars. We used an asset correlation of 0.25 to
simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black
line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default
rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50 is the median for the posterior
distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the blue line is the P-value of the
actual default rate, which is the probability of observing a default at or lower than the actual default rate.
We summarize the numbers that underlie Figure 22 in Tables 13 and 14. Table 13 contains the number of firms, number of defaults,
median and mean predicted default rate per year, as well as the 10th and 90th percentiles of the predicted default rate. We find that the
mean predicted default rates are much larger than the median predicted default rates, indicating that the correlation effect skews the
distribution of default rates to the right, with a long tail of high default rates. If we had ignored this effect and simply compared the
realized default rate with the mean predicted default rate, we would have falsely concluded that the model overpredicts defaults.
Table 14 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the P-value by year.


TABLE 13  Comparison of mean and median predicted number of defaults with the realized number of defaults between 1996 and 2006

Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   Percentile   Percentile
1996   0.87%            0.40%              0.44%          0.00%        2.00%        1596
1997   0.90%            0.50%              0.37%          0.00%        2.10%        1610
1998   0.67%            0.30%              0.38%          0.00%        1.40%        1588
1999   0.98%            0.50%              0.35%          0.00%        2.20%        1692
2000   0.92%            0.50%              0.38%          0.00%        2.10%        1580
2001   1.34%            0.80%              0.88%          0.10%        3.10%        1360    12
2002   1.92%            1.20%              1.70%          0.20%        4.40%        1409    24
2003   3.09%            2.20%              0.66%          0.50%        6.80%        1513    10
2004   1.65%            1.00%              0.77%          0.20%        3.80%        1563    12
2005   1.14%            0.70%              0.31%          0.10%        2.60%        1627
2006   0.47%            0.20%              0.06%          0.00%        0.80%        1607

The sample was restricted to European firms larger than 300 million dollars.

TABLE 14  Summary of aggregate shock and year-wise probability of realizing the actual number of defaults between 1996 and 2006

Year   10th Percentile   Median Aggregate Shock   90th Percentile   Probability of having actual
                                                                    defaults or even lower
1996   -0.30             0.04                     0.37              56.09%
1997   -0.16             0.21                     0.57              47.61%
1998   -0.44             -0.07                    0.28              60.46%
1999   -0.08             0.28                     0.62              46.15%
2000   -0.18             0.18                     0.53              48.31%
2001   -0.37             -0.08                    0.21              55.76%
2002   -0.52             -0.29                    -0.06             64.51%
2003   0.70              1.02                     1.32              17.02%
2004   -0.04             0.26                     0.54              42.95%
2005   0.15              0.54                     0.92              38.48%
2006   -0.05             0.57                     1.17              49.24%

The sample was restricted to European firms larger than 300 million dollars.

Results for Countries having at Least 100 Companies with Size Greater than $300 Million
We restricted the sample to the countries that have at least 100 companies with size greater than $300 million. These
countries tend to have larger equity markets. For these companies, the predicted default rate tracks the realized default
rate very well as shown in Figure 23. The relatively low default rate in year 2003 is indicative of a large positive shock.
The P-values of the realized default rate range from 20% to 69%, which is within the sampling variability that would be
expected.

FIGURE 23

Comparison of median predicted default rate with the realized default rate, 1996-2006

The sample was restricted to European non-financial firms larger than 300 million dollars from the following countries: Switzerland,
Germany, Spain, France, Great Britain, Italy, Netherlands, and Sweden. We used an asset correlation of 0.25 to simulate defaults in
each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black line is the median
predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right
panel shows the aggregate shock distribution and P-values. The dark black line, rm50 is the median for the posterior distribution of
the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the blue line is the P-value of the actual default
rate, which is the probability of observing a default at or lower than the actual default rate.

4.2.6 Level Validation with CDS Data


The number of defaults observed for larger firms in Europe was lower than in North America, which makes the level
test somewhat weaker in Europe than in North America. Therefore, we present an additional, indirect validation of EDF
credit measures in Europe. This test analyzes the level bias in the European EDF credit measure relative to the U.S.
EDF credit measure. The rationale for the test is the assumption that similar risks command similar premia in the
U.S. and Europe. So, if we group firms by EDF category, the same EDF category should have the same
aggregate median spread in the CDS markets of the two regions.
For example, if EDF levels in Europe substantially overstated the level of default risk in Europe relative to North
America, then if we were to compare a European firm to a North American firm with a comparable EDF, the European
firm on average would have a substantially lower CDS spread. Conversely, if there were no such systematic bias between
EDF credit measures in North America versus Europe, then the median spread should be approximately the same.
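
The comparison described above reduces to grouping CDS quotes by EDF-implied rating category and region, then comparing medians. A minimal sketch follows. The quotes, the bucket labels, and the function name are invented for illustration and do not reproduce the data behind the figures.

```python
from collections import defaultdict
from statistics import median


def median_spread_by_bucket(quotes):
    """quotes: iterable of (region, edf_implied_rating, cds_spread_bp)."""
    buckets = defaultdict(list)
    for region, rating, spread in quotes:
        buckets[(rating, region)].append(spread)
    return {key: median(values) for key, values in buckets.items()}


# Hypothetical spreads in basis points, invented for illustration only.
quotes = [
    ("US", "A", 30.0), ("US", "A", 36.0), ("US", "A", 42.0),
    ("EU", "A", 28.0), ("EU", "A", 35.0), ("EU", "A", 44.0),
    ("US", "Baa", 60.0), ("US", "Baa", 75.0), ("US", "Baa", 90.0),
    ("EU", "Baa", 58.0), ("EU", "Baa", 72.0), ("EU", "Baa", 95.0),
]
medians = median_spread_by_bucket(quotes)
# A gap near zero in every bucket is what "no relative level bias" would look like.
gaps = {
    rating: medians[(rating, "EU")] - medians[(rating, "US")]
    for rating in ("A", "Baa")
}
print(gaps)
```

A persistently negative gap across buckets would point to European EDF levels that overstate risk relative to the U.S., and a persistently positive gap to the opposite.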
In Figure 24, we compare the median, 25th, and 75th percentile CDS spreads for the Aa and above and A EDF-implied rating
categories. The median spreads, as well as the 25th and 75th percentiles, are comparable over time in the U.S. and Europe,
indicating no bias in European EDF levels relative to U.S. levels. We repeated the comparison for the Baa, Ba, B, and
Caa EDF-implied rating categories (see footnote 12) and found comparable results, shown in Figures 25 and 26.
There was some overlap in the underlying names in the two currencies. However, our findings are robust to
using a completely non-overlapping sample as well. The subinvestment-grade names can be affected by liquidity risk that
differs across regions, which makes the test less reliable for those categories.

FIGURE 24 Comparison of CDS spreads in the U.S. and Europe for Aa and above and A EDF-implied
rating categories

Blue lines represent the 25th percentile, median, and 75th percentile of the CDS spread in Europe; red lines show the same statistics for the U.S.

12

The category Aaa is not shown because there were very few observations for CDS contracts in this category.


FIGURE 25 Comparison of CDS spreads in the U.S. and Europe for Baa and Ba EDF-implied rating
categories

Blue lines represent the 25th percentile, median, and 75th percentile of the CDS spread in Europe; red lines show the same statistics for the U.S.

FIGURE 26 Comparison of CDS spreads in the U.S. and Europe for B and Caa EDF-implied rating
categories

Blue lines represent the 25th percentile, median, and 75th percentile of the CDS spread in Europe; red lines show the same statistics for the U.S.

4.2.7 Conclusion
We showed that in Europe, EDF credit measures lead agency ratings in timely default prediction. EDF credit measures
also lead the alternative measures in their ability to discriminate good firms from bad firms over time and across various
subsections of the data. Model-predicted default rates track realized default rates well, and CDS spreads are similar to
those in the U.S. for the same EDF-implied rating categories.

4.3 Asia

In this section, we describe the results obtained in Asia.

4.3.1 Data
We start with all Asian firms that had publicly traded equity from 1996 to 2006. We restrict the sample to non-financial
firms with more than $30 million in size (see footnote 18), unless otherwise specified, to account for hidden or missing
defaults. Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges,
and insolvency proceedings.
Table 15 shows the number of companies by country in the Asian module of Credit Monitor and CreditEdge for two size
categories: above $30 million and above $300 million.
We excluded the following countries from level validation:

• China, because the definition of default under government intervention is not clear
• Australia and New Zealand, because they belong to the Pacific region
• Japan, because it has a different economic structure and a hidden default problem
• Pakistan and Sri Lanka, because they have a small number of companies

The remaining countries have the most comprehensive default coverage. These countries are Hong Kong, India,
Indonesia, Korea, Malaysia, Philippines, Singapore, Thailand, and Taiwan. We ran power and level validation tests
separately for Japan.

18

Size is measured by the sales of the firm for non-financial firms. Whenever the firm's total sales number was not available, we used
the book asset value of the firm. This number was further adjusted for inflation across years by converting it to a
common denomination using a deflation adjustor calculated internally at Moody's KMV.


TABLE 15  Number of companies in Asian Module of Credit Monitor and CreditEdge by country and size

Country        Code   Size >= $30 Million   Size >= $300 Million
Australia      AUS    844                   258
China          CHN    1357                  385
Hong Kong      HKG    695                   231
Indonesia      IDN    200                   64
India          IND    567                   174
Japan          JPN    3955                  2274
Korea          KOR    899                   377
Sri Lanka      LKA    19
Malaysia       MYS    643                   135
New Zealand    NZL    101                   47
Pakistan       PAK    83                    24
Philippines    PHL    96                    25
Singapore      SGP    494                   129
Thailand       THA    360                   79
Taiwan         TWN    1196                  310

4.3.2 Timely Default Prediction


In this section, we compare EDF credit measures against agency ratings in their ability to predict defaults in a timely
manner, following the methodology described in Section 3.1. We create a sample of defaulted firms, retaining
monthly observations from 24 months prior to default until 10 months after default. We include only observations
with a non-missing history of EDF values and ratings over the 24 months prior to default. We compute
the median EDF credit measure and the median rating by months to default and overlay the two
series.
Figure 27 demonstrates that in the event of default, EDF credit measures become elevated 10 months before ratings.


FIGURE 27 Median agency ratings and Moody's KMV EDF values for all rated defaulted firms in Asia from
24 months before default to 10 months after default between 1996 and 2006. EDF values are displayed
on a log scale.

4.3.3 Default Predictive Power


The EDF credit measure has more discriminatory power than Z-Score and Merton default probability in Hong Kong,
India, Indonesia, Korea, Malaysia, Philippines, Singapore, Thailand, and Taiwan, as can be seen in Figure 28. The
Accuracy Ratio for the EDF credit measure is 0.67. In contrast to the power tests performed in North America and
Europe, Z-Score slightly outperforms the simple Merton model implied default probability in its ability to discriminate
between bad and good firms, with Accuracy Ratios of 0.57 and 0.56, respectively.


FIGURE 28 Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit measures,
Z-Scores, and Merton default probabilities for Asian non-financial companies between 10/2001 and 12/2006. The
Accuracy Ratios for the EDF credit measure, Z-Score, and Merton default probability are 0.67, 0.57, and 0.56, respectively.

The EDF credit measure also has more discriminatory power than Z-Score and Merton default probability in Japan.
Consistent with the results in the other nine countries, Z-Score has a higher Accuracy Ratio than Merton default probability.
CAP curves are presented in Figure 29. The Accuracy Ratios of the EDF credit measure, Z-Score, and Merton default
probability are 0.89, 0.79, and 0.77, respectively.


FIGURE 29 CAP curves comparing Moody's KMV EDF credit measures, Z-Scores, and Merton default probabilities for
Japanese non-financial companies between 10/2001 and 12/2006. The Accuracy Ratios for the EDF credit measure,
Z-Score, and Merton default probability are 0.89, 0.79, and 0.77, respectively.

4.3.4 Level Validation


Figure 30 shows the level validation results for the sample of Asian firms (Hong Kong, India, Indonesia, Korea, Malaysia,
Philippines, Singapore, Thailand, and Taiwan) with size greater than $300 million. The left panel of Figure 30 displays
the mean and median predicted default rates and the actual default rate, along with the 80% prediction interval for the
predicted default rate. We used an asset correlation of 0.25 to simulate defaults in each year. The right panel of Figure 30
presents the posterior distribution of the aggregate shock given the actual default rate, and the P-value of the actual
default rate, which is the probability of observing a default rate at or below the actual default rate.
Collection of default data in Asia is more difficult than in the U.S. and Europe because of language barriers, poor
reporting of default events, and government intervention to prevent company collapse, which often goes unreported. We
could expect the underprediction of defaults in 1996, 1997, and 1998 because of the severe Asian financial crisis. The
overprediction of defaults in 2001 and 2002 may reflect market uncertainties regarding the Asian recovery while Europe
and North America were in recession. The P-values of the realized default rate range from 11% to 87%, which is within
the sampling variability that would be expected.


FIGURE 30

Comparison of median predicted default rate with the realized default rate, 1996-2006

The sample was restricted to Asian non-financial firms larger than 300 million dollars from the following countries: Hong Kong,
India, Indonesia, Korea, Malaysia, Philippines, Singapore, Thailand, and Taiwan. We used an asset correlation of 0.25 to simulate
defaults in each year. On the left panel, the gray lines represent 80% prediction interval for predicted default rate, the black line is the
median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The
right panel shows the posterior distribution of the aggregate shock distribution and P-values. The dark black line is the median
aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles for the aggregate shock; the blue line is the P-value of
the actual default rate, which is the probability of observing a default at or lower than the actual default rate.

Accuracy of Levels in Japan


Figure 31 presents the level validation results for the sample of Japanese firms with size greater than $300 million. As
above, the left panel of Figure 31 displays the mean and median predicted default rates and the actual default rate, along
with the 80% prediction interval for predicted defaults. We used an asset correlation of 0.25 to simulate defaults in each
year. The right panel of Figure 31 presents the posterior distribution of the aggregate shock given the actual default rate,
and the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
As expected, the EDF credit measure is higher than the observed default rate in Japan, due to the practice of banks and
parent companies extending credit to companies that otherwise would default.


FIGURE 31

Comparison of median predicted default rate with the realized default rate, 1996-2006

The sample was restricted to Japanese non-financial firms larger than 300 million dollars. We used an asset correlation of 0.25 to
simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black
line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default
rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50, is the median for the posterior
distribution of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles; the blue line is the P-value of the
actual default rate, which is the probability of observing a default at or lower than the actual default rate.

4.3.5 Conclusion
We showed that in Asia, EDF credit measures lead agency ratings in timely default prediction. For countries where
we have the best default coverage, EDF credit measures lead the alternative measures in their ability to discriminate good
firms from bad firms over time and across various subsections of the data. The realized default rate for countries with better
default coverage lies within the prediction interval. In Japan, the EDF model discriminates distressed firms from healthy
firms very well.

4.4 Median EDF by Rating Category across Regions

Four panel graphs in Figure 32 display the median EDF credit measure by rating categories across different regions.
Levels of EDF credit measure for North American non-financial companies, Asia-Pacific non-financial companies, and
global financial companies are comparable for all rating categories. Levels in Europe are a bit lower for better quality
firms in rating categories of A and Baa.

FIGURE 32 Comparison of median EDF across different regions by Moody's rating categories

[Figure: four panels (North American non-financial companies, European non-financial companies, Asian-Pacific non-financial
companies, Global financial companies). Vertical axis: EDF8, with ticks from 0.01% to 100%. Horizontal axis: 01M90 to 01M10.
Series labels recoverable from the extraction: Ba, Baa.]

CONCLUSION
In this document, we tested the performance of the Moody's KMV EDF credit measure in its timeliness of default
prediction, ability to discriminate good firms from bad firms, and accuracy of levels. Whenever possible, we compared
the performance to other popular alternatives available to the market.
We find that the EDF credit measure performed well on all counts over time and across various subsections of the data.
We also showed that the Moody's KMV model works well not only in North America, but also in Europe and Asia. In
Europe, our findings are especially significant because our European sample consisted of various subregions with
substantially different debt-holding structures and bankruptcy mechanisms, which could have adversely affected the
results if the model were not universal in concept.
While research at Moody's KMV continues to improve this measure and to take into account the
nuances of the data as the markets evolve and become more complex, we feel that as of now, this measure sets a standard
in the industry for a transparent and predictive absolute measure of the probability of default.


APPENDIX A: CAP VS. ROC


The most popular validation techniques available today are the Cumulative Accuracy Profile (CAP) and the Receiver
Operating Characteristic (ROC). The summary statistic of the CAP is the Accuracy Ratio (AR), while the summary statistic
of the ROC is the area under the ROC curve (AUC).

As a specific case, consider a sample that has N defaulters and M non-defaulters (see footnote 19). The ith firm in the
sample is assigned a default probability \(p_i\). Without loss of generality, assume the firms are ordered so that
\(p_1 \ge p_2 \ge p_3 \ge \dots \ge p_{M+N}\). For each \(p_i\), one can assign a pair \((m(p_i), n(p_i))\), where \(m(p_i)\) is the
number of non-defaulters that have a default probability greater than or equal to \(p_i\), and \(n(p_i)\) is the number of
defaulters that have a default probability greater than or equal to \(p_i\). Obviously \(m(p_{M+N}) = M\) and \(n(p_{M+N}) = N\).
One can translate these numbers into fractions:

\[ f_m(p_i) = \frac{m(p_i)}{M}, \qquad f_n(p_i) = \frac{n(p_i)}{N}, \qquad f(p_i) = \frac{m(p_i) + n(p_i)}{M+N} \]

Here \(f_m(p_i)\) is the fraction of non-defaulters that have a default probability greater than or equal to \(p_i\), \(f_n(p_i)\) is
the fraction of defaulters that have a default probability greater than or equal to \(p_i\), and \(f(p_i)\) is the fraction of all firms
in the sample that have a default probability greater than or equal to \(p_i\).
CAP is now defined as the graph of \(f_n(p_i)\) against \(f(p_i)\) for all values of \(p_i\). ROC is the graph of \(f_n(p_i)\) against
\(f_m(p_i)\) for all values of \(p_i\). Alternatively, ROC can also be interpreted as the curve that plots the hit rate against the
false alarm rate for any cut-off C, across all values of C.
The Receiver Operating Characteristic is a popular approach borrowed from medical science. ROC curves, also known as
power curves, are well-known ways of establishing the ability of a model to distinguish signals from noise, or in our case,
defaulters from non-defaulters. The basic approach takes a sample of M+N firms, of which M are good firms
(non-defaulters) and N are defaulted firms. If we rank the firms by their likelihood of defaulting, from highest to lowest,
and exclude the riskiest z% of firms from the sample, then we will exclude some actual defaulters and some
non-defaulters. In total, we exclude z/100*(M+N) firms.
A perfect model would exclude only defaulters as long as z/100 is less than N/(M+N), and would have excluded all
defaulters once z/100 reaches N/(M+N). A random model with no information would exclude zM/100 non-defaulters
(i.e., z% of non-defaulters) and zN/100 defaulters (i.e., z% of defaulters). Let us assume that, by excluding the riskiest z%
of the sample, we exclude x(z)% of non-defaulters and y(z)% of defaulters. By varying z from 0 to 100, we obtain pairs
(x,y). Plotting y against x on an X-Y plot gives the ROC curve, which indicates the discriminatory power of the model.
For example, a model with no predictive power should have its (x,y) plot on the 45-degree line. A perfect model rises
vertically to 100% and then follows a flat horizontal line at 100%. Note that both x and y vary from 0 to 100%. Also, this
test needs only an ordinal ranking and therefore can be used to compare all the various approaches to credit risk on the
same footing.
The area between the CAP curve and the 45-degree line, as a fraction of the area between the perfect model's CAP curve
and the 45-degree line, is called the Accuracy Ratio (AR). The area under the ROC curve is called the AUC (Area Under
Curve). This is illustrated in Figures 33 and 34. Both of these figures are based on the same population of non-defaulters
and defaulters. Figure 33 shows the Cumulative Accuracy Profile curve, which is the curve bounding area A. A perfect
model would have the Cumulative Accuracy Profile curve represented by the straight line bounding area B. The Accuracy
Ratio is A/(A+B). Figure 34 shows the ROC curve bounding the shaded area. The shaded area represents the AUC. It can
be shown that AR = 2AUC - 1. For more details and a proof of this relationship, refer to Engelmann, Hayden, and
Tasche (2003).
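
The definitions above translate directly into a short calculation. The sketch below is illustrative only: it builds the CAP and ROC points for a toy sample with distinct scores, integrates both curves with the trapezoid rule, and checks the relationship between AR and AUC numerically. The scores, default flags, and function names are hypothetical.

```python
def cap_and_roc(scores, defaulted):
    """Return (CAP points, ROC points), walking firms from riskiest to safest."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_def = sum(defaulted)
    n_non = len(defaulted) - n_def
    cap, roc = [(0.0, 0.0)], [(0.0, 0.0)]
    hits = false_alarms = 0
    for rank, i in enumerate(order, start=1):
        if defaulted[i]:
            hits += 1
        else:
            false_alarms += 1
        cap.append((rank / len(scores), hits / n_def))    # f_n against f
        roc.append((false_alarms / n_non, hits / n_def))  # f_n against f_m
    return cap, roc


def area_under(points):
    """Trapezoid-rule area under a piecewise-linear curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def accuracy_ratio(cap, default_fraction):
    """Area between CAP and diagonal, over the same area for a perfect model."""
    area_model = area_under(cap) - 0.5
    area_perfect = 0.5 * (1.0 - default_fraction)
    return area_model / area_perfect


# Hypothetical sample: 2 defaulters (N) and 6 non-defaulters (M).
scores = [0.20, 0.15, 0.10, 0.08, 0.05, 0.03, 0.02, 0.01]
defaulted = [1, 0, 1, 0, 0, 0, 0, 0]

cap, roc = cap_and_roc(scores, defaulted)
auc = area_under(roc)
ar = accuracy_ratio(cap, sum(defaulted) / len(defaulted))
print(f"AUC = {auc:.4f}, AR = {ar:.4f}, 2*AUC - 1 = {2 * auc - 1:.4f}")
```

Tied scores would need a convention, such as moving along both axes at once, which this sketch leaves out to stay short.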

19

Defaulters are usually counted over a certain horizon. Therefore these tests are horizon specific. In this document, all tests are for a 1-year horizon.


FIGURE 33

FIGURE 34


APPENDIX B: SUMMARY OF ACCURACY RATIOS FOR EDF CREDIT MEASURES AND AGENCY RATINGS BY YEAR

TABLE 16 Accuracy Ratios for EDF Credit Measures and agency ratings
for U.S. non-financial companies by year at 1-year horizon

Date                                        EDF Credit Measure   Ratings
1996 (EDF: 12/95; Defaults: 1/96-12/96)     0.72                 0.80
1997 (EDF: 12/96; Defaults: 1/97-12/97)     0.91                 0.80
1998 (EDF: 12/97; Defaults: 1/98-12/98)     0.90                 0.76
1999 (EDF: 12/98; Defaults: 1/99-12/99)     0.85                 0.75
2000 (EDF: 12/99; Defaults: 1/00-12/00)     0.76                 0.67
2001 (EDF: 12/00; Defaults: 1/01-12/01)     0.74                 0.65
2002 (EDF: 12/01; Defaults: 1/02-12/02)     0.79                 0.58
2003 (EDF: 12/02; Defaults: 1/03-12/03)     0.86                 0.77
2004 (EDF: 12/03; Defaults: 1/04-12/04)     0.85                 0.84
2005 (EDF: 12/04; Defaults: 1/05-12/05)     0.89                 0.75
2006 (EDF: 12/05; Defaults: 1/06-12/06)     0.96                 0.82


TABLE 17 Accuracy Ratios for EDF Credit Measures and agency ratings for U.S. non-financial companies by year at 5-year horizon

Date                                       EDF Credit Measure   Ratings
1991 (EDF: 12/90; Defaults: 1/91-1/96)     0.77                 0.73
1992 (EDF: 12/91; Defaults: 1/92-1/97)     0.84                 0.81
1993 (EDF: 12/92; Defaults: 1/93-1/98)     0.79                 0.76
1994 (EDF: 12/93; Defaults: 1/94-1/99)     0.67                 0.66
1995 (EDF: 12/94; Defaults: 1/95-1/00)     0.68                 0.58
1996 (EDF: 12/95; Defaults: 1/96-1/01)     0.68                 0.62
1997 (EDF: 12/96; Defaults: 1/97-1/02)     0.66                 0.62
1998 (EDF: 12/97; Defaults: 1/98-1/03)     0.67                 0.64
1999 (EDF: 12/98; Defaults: 1/99-1/04)     0.65                 0.65
2000 (EDF: 12/99; Defaults: 1/00-1/05)     0.61                 0.64
2001 (EDF: 12/00; Defaults: 1/01-1/06)     0.65                 0.61
2002 (EDF: 12/01; Defaults: 1/02-1/07)     0.71                 0.60

We used EDF data up until 12/2002 and default data up until 12/2006.


REFERENCES
1. Korablev, Irina, 2005, Power and Level Validation of the EDF Credit Measure in the European Market.
2. Bohn, Jeff, Navneet Arora, and Irina Korablev, 2005, Power and Level Validation of the Moody's KMV EDF Credit Measure in the U.S. Market.
3. Agrawal, Deepak, Navneet Arora, and Jeffrey Bohn, 2004, Parsimony in Practice: An EDF-based Model of Credit Spreads, Moody's KMV White Paper.
4. Arora, Navneet, Jeffrey Bohn, and Fanlin Zhu, 2005, Reduced vs. Structural Models of Credit Risk: A Case Study of Three Models, Moody's KMV Technical Document.
5. Crosbie, Peter, and Jeffrey Bohn, 2003, Modeling Default Risk, Moody's KMV Technical Document.
6. Das, Ashish, Amnon Levy, Anil Gurnaney, Jeffrey Bohn, Peter Crosbie, and Stephen Kealhofer, 2004, Modeling Portfolio Risk, Moody's KMV Technical Document.
7. Dwyer, Douglas W., and Shisheng Qu, 2007, EDF 8.0 Model Enhancements.
8. Dwyer, Douglas W., 2007, The Distribution of Defaults and Bayesian Model Validation, Journal of Risk Model Validation, Volume 1, no. 1.
9. Engelmann, Berndt, Evelyn Hayden, and Dirk Tasche, 2003, Testing Rating Accuracy, Risk, January 2003.
10. Eom, Young Ho, Jean Helwege, and Jing Zhi Huang, 2003, Structural Models of Corporate Bond Pricing: An Empirical Analysis, Review of Financial Studies.
11. Hull, John, 1999, Options, Futures and Other Derivatives, Prentice Hall Publications, Fourth Edition.
12. Johnson, Norman, Samuel Kotz, and Adrienne Kemp, 1993, Univariate Discrete Distributions, 2nd Ed., NY: Wiley.
13. Kurbat, Matt, and Irina Korablev, 2002, Methodology for Testing the Level of the EDF Credit Measure, Moody's KMV White Paper.
14. Lyden, Scott, and David Saraniti, 2000, An Empirical Analysis of Classical Theory of Corporate Security Valuation, Research Paper, Barclays Global Investors.
