AUTHORS
Irina Korablev
Douglas Dwyer
ABSTRACT
In this paper, we validate the performance of Moody's KMV EDF credit measures in terms of
timeliness of default prediction, ability to discriminate good firms from bad firms, and
accuracy of levels in three regions: North America, Europe, and Asia. We focus on the period
1996-2006 for most of our tests. Wherever possible, we compare the performance to that of
other popular alternatives, such as agency ratings, Moody's KMV RiskCalc EDF credit
measures, Altman's Z-Scores, and a simpler version of the Merton model. We find that EDF
credit measures perform consistently well across different time horizons and across
subsamples based on firm size and credit quality. Our tests indicate that EDF credit measures
provide a very useful measure of credit risk that can be applied throughout the world.
Copyright 2007, Moody's KMV Company. All rights reserved. Credit Monitor, CreditEdge, CreditEdge Plus,
CreditMark, DealAnalyzer, EDFCalc, Private Firm Model, Portfolio Preprocessor, GCorr, the Moody's KMV logo,
Moody's KMV Financial Analyst, Moody's KMV LossCalc, Moody's KMV Portfolio Manager, Moody's KMV Risk
Advisor, Moody's KMV RiskCalc, RiskAnalyst, Expected Default Frequency, and EDF are trademarks owned by MIS
Quality Management Corp. and used under license by Moody's KMV Company.
Published by:
Moody's KMV Company
To Learn More
Please contact your Moody's KMV client representative, visit us online at www.moodyskmv.com, contact
Moody's KMV via e-mail at info@mkmv.com, or call us at:
NORTH AND SOUTH AMERICA, NEW ZEALAND AND AUSTRALIA, CALL:
1 866 321 MKMV (6568) or 415 874 6000
EUROPE, THE MIDDLE EAST, AFRICA AND INDIA, CALL:
44 20 7280 8300
FROM ASIA CALL:
813 3218 1160
TABLE OF CONTENTS
1 INTRODUCTION
4.2 Europe
4.2.1 Diversity in Bankruptcy Mechanisms and Creditor Protection
4.2.2 Data
4.2.3 Timely Default Prediction
4.2.4 Default Predictive Power
4.2.5 Level Validation with Default Data
4.2.6 Level Validation with CDS Data
4.2.7 Conclusion
4.3 Asia
4.3.1 Data
4.3.2 Timely Default Prediction
4.3.3 Default Predictive Power
4.3.4 Level Validation
4.3.5 Conclusion
CONCLUSION
APPENDIX B: SUMMARY OF ACCURACY RATIOS FOR EDF CREDIT MEASURES AND AGENCY RATINGS BY YEAR
INTRODUCTION
The new Basel Capital Accord states: "The methodology for assigning credit assessments must be rigorous, systematic,
and subject to some form of validation based on historical experience." There are two important components to this
validation process: the ability to predict defaults and the accuracy of the default predictive measure.
The first criterion implies that a credit measure should be dynamic enough to be a meaningful and timely signal of
deteriorating credit quality or an impending credit event. In this regard, the Basel Accord states: "Assessments must be
subject to ongoing review and responsive to changes in financial condition. Before being recognized by supervisors, an
assessment methodology for each market segment, including rigorous back-testing, must have been established for at
least one year." This also means that the credit assessment technology should have the ability to distinguish between
defaulters and non-defaulters. It should not allow defaulters to enter the sample while trying to create a sample of good
quality firms (Type I Error). Conversely, it should not exclude good quality firms from the sample while trying to
exclude potential defaulters (Type II Error).
The second criterion is focused on the accuracy of the credit assessment measure so that it can be useful to banks and
other financial institutions in their efforts toward risk measurement, valuation, and capital allocation. The Basel Accord
states: "Banks must have a robust system in place to validate the accuracy and consistency of rating systems, processes,
and the estimation of PDs (Probabilities of Default)."
The objective of this document is to compare the performance, based on the above validation criteria, of EDF credit
measures with some of the other popular credit assessment approaches. The popular approaches that we consider are the
following:
Agency ratings
Moody's KMV RiskCalc EDF credit measures
A simpler version of the Merton model
Altman's Z-Score
In this paper we present our test results for three regions: North America, Europe, and Asia. The rest of the paper is
organized as follows: Section 2 briefly discusses the credit assessment approaches that we consider. Section 3
describes the empirical methodology we follow to compare the approaches. Section 4 presents the results of our tests by
region [1] and interprets the economic meaningfulness of these results. Section 5 concludes the paper.
[1] Section 4.1 presents the results for North America, Section 4.2 presents the results for Europe, and Section 4.3 presents the results for
Asia.
[2] For reasons explained in the next two sections, not all the approaches can be subjected to tests on all the criteria. We try to include as
many of these approaches as possible in our test of each criterion.
2.1
The structural view on credit risk was first made commercially viable with the introduction of the Vasicek-Kealhofer
(VK) model. This model offers a rich framework that treats equity as a perpetual down-and-out option on the
underlying assets of the firm. This framework incorporates five different classes of liabilities: short-term liabilities,
long-term liabilities, convertible debt, preferred shares, and common shares. To overcome the regular problems encountered
by structural models due to the assumption of normality, the VK model uses an empirical mapping based on actual
default data to get the default probabilities, known as EDF credit measures and offered by Moody's KMV. [3] Volatility is
estimated through a Bayesian approach that combines a comparables analysis with an iterative approach.
EDF credit measures are the outputs of Moody's KMV Credit Monitor and CreditEdge applications. An EDF credit
measure is a quantitative measure of credit quality. More specifically, an EDF credit measure is an estimate of the
physical probability of default for a given firm. For an overview of the EDF credit measure, see Crosbie and Bohn
(2003).
In 2007, Moody's KMV released EDF 8.0, which refines the mapping of the Distance-to-Default to the EDF credit
measure using a much larger default database observed over a longer time period. Details of the new model enhancement
can be found in Dwyer and Qu (2007).
The EDF estimates are now bounded between 0.01% (for an EDF value of 0.01) and 35% (for an EDF value of 35).
Moody's KMV offers a term structure of EDF credit measures for 1 to 10 years and an extrapolation scheme to get
shorter-term EDF credit measures. The risk-free rate used in the calculation of EDF credit measures is now updated
monthly.
2.2
Agency Ratings
Moody's Investors Service, Standard & Poor's Corporation, and other well-known rating agencies around the world
have been assigning credit ratings to major borrowers for decades. These are ordinal measures of credit quality (i.e.,
they help rank firms by their quality of credit). These ratings have established international credibility because of the
long history of rating agencies and the extensive testing of their relative performance.
2.3
Moody's KMV RiskCalc is designed to calculate EDF credit measures for private companies. Private companies are
typically smaller than public companies and are not required to file financial statements with the SEC.
The RiskCalc model incorporates aspects of both the structural, market-based approach in the form of industry-level
distance-to-default measures, and the localized financial statement-based approach. While it incorporates equity market
information at the aggregate level, RiskCalc does not take advantage of the equity information of the specific company.
We used the RiskCalc v3.1 U.S. model to obtain RiskCalc EDF credit measures for the set of publicly traded companies.
Comparing public firm EDF credit measures to RiskCalc EDF credit measures computed on public firms represents an
out-of-universe test of RiskCalc.
2.4
The Merton model of risky debt is the original structural model of credit risk, and perhaps the most significant
contribution to the area of quantitative credit risk research. This model assumes that equity is a call option on the value
of assets of the firm. From this insight, the value of debt can be derived based on the observed equity value. The default
event is modeled as the firm's asset value falling below a threshold level (i.e., default barrier). Given the default barrier
and the asset value parameters, the probability of default can be estimated for various horizons. A detailed description of
this model can be found in most standard finance textbooks. [4]
[3] See Eom, Helwege, and Huang (2003) for details of the discussion.
[4] See, for example, Hull (1999).
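For our specific tests, the model has been implemented as:
$$\text{Default Point}_{i,\text{Merton}} = \text{Short Term Liabilities} + 0.5 \times \text{Long Term Liabilities}$$
The default probability for a firm $i$ for a time horizon $t$ is computed as:
$$PD_i = \Phi\left(-\,\frac{\ln\left(\dfrac{AVL_i}{\text{Default Point}_{i,\text{Merton}}}\right) + \left(\mu_i - 0.5\,\sigma_i^2\right)t}{\sigma_i\sqrt{t}}\right) \tag{1}$$
$$\sigma_i = \sigma_i^{equity}\,\frac{EVL_i}{AVL_i}\left(\frac{\partial EVL_i}{\partial AVL_i}\right)^{-1} \tag{2}$$
$$EVL_i = AVL_i\,\Phi(d_1) - \text{Default Point}_{i,\text{Merton}}\;e^{-rt}\,\Phi(d_2) \tag{3}$$
where
$$d_1 = \frac{\ln\left(\dfrac{AVL_i}{\text{Default Point}_{i,\text{Merton}}}\right) + \left(r + 0.5\,\sigma_i^2\right)t}{\sigma_i\sqrt{t}}, \qquad d_2 = d_1 - \sigma_i\sqrt{t}$$
Here $\sigma_i$, $\sigma_i^{equity}$, $AVL_i$, and $EVL_i$ are the asset volatility, equity volatility, asset value, and equity value of firm $i$, respectively.
$\Phi(x)$ is the cumulative normal distribution function, and under equation (3) $\partial EVL_i / \partial AVL_i = \Phi(d_1)$. $\mu_i$ is the drift rate for the asset returns of firm $i$, while $r$ is the
risk-free rate. $\sigma_i^{equity}$ is computed as the standard deviation of three years of weekly equity returns for company
$i$. Asset value $AVL_i$ is computed by solving equations (2) and (3) simultaneously. [5]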
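To make the two-equations, two-unknowns step concrete, the following is a minimal sketch of the simple Merton implementation, assuming a plain fixed-point iteration for equations (2) and (3). The function names, the iteration scheme, the tolerance, and the firm inputs are illustrative assumptions and are not taken from the study; the iteration may not converge for extreme leverage or volatility.

```python
from math import exp, log, sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def default_point(short_term_liabilities, long_term_liabilities):
    """Default point from the text: short-term plus half of long-term liabilities."""
    return short_term_liabilities + 0.5 * long_term_liabilities


def d1_d2(asset, sigma, dp, r, t):
    """d1 and d2 as defined below equation (3)."""
    d1 = (log(asset / dp) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return d1, d1 - sigma * sqrt(t)


def solve_assets(equity, equity_vol, dp, r, t=1.0, tol=1e-10, max_iter=1000):
    """Solve equations (2) and (3) for asset value and asset volatility.

    Uses fixed-point iteration, which is an illustrative choice only.
    """
    asset = equity + dp                   # starting guess
    sigma = equity_vol * equity / asset   # starting guess
    for _ in range(max_iter):
        d1, d2 = d1_d2(asset, sigma, dp, r, t)
        # Equation (3) rearranged for the asset value.
        new_asset = (equity + dp * exp(-r * t) * PHI(d2)) / PHI(d1)
        # Equation (2) with dEVL/dAVL = Phi(d1).
        new_sigma = equity_vol * equity / (new_asset * PHI(d1))
        done = abs(new_asset - asset) < tol and abs(new_sigma - sigma) < tol
        asset, sigma = new_asset, new_sigma
        if done:
            break
    return asset, sigma


def merton_pd(asset, sigma, dp, mu, t=1.0):
    """Equation (1): probability that the asset value ends below the default point."""
    dd = (log(asset / dp) + (mu - 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return PHI(-dd)


# Hypothetical firm: equity 40, equity volatility 60%, liabilities 40 short + 40 long.
dp = default_point(40.0, 40.0)
asset, sigma = solve_assets(equity=40.0, equity_vol=0.60, dp=dp, r=0.05)
print(round(asset, 2), round(sigma, 4), merton_pd(asset, sigma, dp, mu=0.08))
```

The drift `mu` enters only equation (1), while the risk-free rate `r` is used when backing out the asset value and volatility from equity data.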
2.5
Altman's Z-Score
Altman's Z-Score came as a response to the need for identifying the financial health of any business based on observable
accounting and market ratios. This original measure was developed in 1968 by Edward Altman, whose Z-Score is
available in various forms. We chose the public firm form, which includes market capitalization in the leverage ratio, and
calculated Z-Scores as follows:
$$Z = X_1 + X_2 + X_3 + X_4 + X_5 \tag{4}$$
where
$$X_1 = 1.2 \times \frac{\text{Current Assets} - \text{Current Liabilities}}{\text{Book Asset Value}}$$
$$X_2 = 1.4 \times \frac{\text{Retained Earnings}}{\text{Book Asset Value}}$$
$$X_3 = 3.3 \times \frac{\text{EBIT}}{\text{Book Asset Value}}$$
$$X_4 = 0.6 \times \frac{\text{Market Capitalization}}{\text{Book Value of Liabilities}}$$
$$X_5 = 1.0 \times \frac{\text{Sales}}{\text{Book Asset Value}}$$
[5] In contrast to the two equations and two unknowns, we use an iterative approach to solve for empirical volatility which is combined
with modeled volatility in a Bayesian fashion.
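A short sketch of the public-firm Z-Score calculation is given here for illustration. The coefficients (1.2, 1.4, 3.3, 0.6, 1.0) and the working-capital and EBIT terms are the commonly quoted values for Altman's 1968 model and are an assumption of this sketch; the function name and the input figures are hypothetical.

```python
def altman_z(current_assets, current_liabilities, retained_earnings, ebit,
             market_cap, book_liabilities, sales, book_assets):
    """Public-firm Z-Score; coefficients are the commonly quoted 1968 values (assumed)."""
    x1 = 1.2 * (current_assets - current_liabilities) / book_assets
    x2 = 1.4 * retained_earnings / book_assets
    x3 = 3.3 * ebit / book_assets
    x4 = 0.6 * market_cap / book_liabilities   # market-based leverage term
    x5 = 1.0 * sales / book_assets
    return x1 + x2 + x3 + x4 + x5


# Hypothetical firm, all figures in the same currency unit.
z = altman_z(current_assets=50.0, current_liabilities=30.0,
             retained_earnings=20.0, ebit=10.0, market_cap=60.0,
             book_liabilities=40.0, sales=100.0, book_assets=100.0)
print(z)
```

A lower Z-Score indicates weaker financial health, so when the Z-Score is compared with default probabilities in a power test, firms are ranked from lowest to highest Z.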
EMPIRICAL METHODOLOGY
In this section, we describe the methodology we chose for tests of each criterion.
3.1
Timeliness measures how many months before an impending credit event EDF credit measures signal deteriorating
credit quality. To test timeliness, we create a sample of defaulted firms, retaining monthly observations from 24 months
prior to default up to 12 months after default. We compute the median EDF credit measure and the median Moody's
rating by months to default, and then overlay and compare the two series.
For testing timeliness against ratings, we use the Moody's rating. To ensure that the measure holds up over time
and across rating grades and firm sizes, we also provide the analysis, wherever possible, for subsets of the data based on time
period:
1996-2000
2001-2006
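The timeliness calculation amounts to grouping observations by months to default and taking a median within each group. A minimal sketch follows, assuming observations arrive as (months to default, EDF) pairs pooled across defaulted firms; the function name and the sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def median_by_months_to_default(observations, first=-24, last=12):
    """Median EDF by month relative to default (0 = default month).

    `observations` is an iterable of (months_to_default, edf) pairs pooled
    over all defaulted firms; months outside [first, last] are dropped.
    """
    buckets = defaultdict(list)
    for months_to_default, edf in observations:
        if first <= months_to_default <= last:
            buckets[months_to_default].append(edf)
    return {m: median(values) for m, values in sorted(buckets.items())}


# Made-up observations for three firms, EDF in percent.
sample = [(-24, 1.0), (-24, 2.0), (-24, 6.0),
          (-12, 4.0), (-12, 9.0), (-12, 5.0),
          (0, 20.0), (0, 35.0), (0, 30.0),
          (-30, 0.5)]  # outside the window, ignored
profile = median_by_months_to_default(sample)
print(profile)
```

The same grouping applied to a numeric rating scale gives the median rating path that the EDF path is overlaid on.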
3.2
While a default predictive measure can be timely in warning of impending defaults, it may not be as effective in
distinguishing a good firm from a bad firm. The calibration of the model may be on the conservative side, inflating the
default probability of all suspect names, some of which might not be genuinely distressed. In this case, even
though one could claim that the model performed well in predicting impending defaults, it would be fairly mediocre in
its ability to distinguish good firms from bad firms. An essential feature of a good model is that it
differentiates bad (genuinely distressed) firms from good firms that merely raise false alarms. There are two
well-known approaches to testing a model for its power:
Cumulative Accuracy Profile (CAP), with its output known as the Accuracy Ratio (AR).
Receiver Operating Characteristic (ROC), with its output known as the Area Under Curve (AUC).
Typically, the larger the Accuracy Ratio or Area Under Curve, the better the model. In the extreme cases, a totally random
model that bears no information on impending defaults has AR = 0 and AUC = 0.5, while a perfect model has
AR = AUC = 1. The two approaches are equivalent, with AR = 2 × AUC - 1. A more detailed discussion can be found in
Appendix A.
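The relationship between the two power statistics can be checked numerically. The sketch below computes AR from the area under the CAP curve and AUC from pairwise comparisons of defaulters and non-defaulters; the function names and the six-firm sample are illustrative assumptions, and the score is any measure where a higher value means higher risk.

```python
from itertools import groupby


def accuracy_ratio(scores, defaulted):
    """AR from the CAP curve: area above the diagonal relative to a perfect model."""
    n = len(scores)
    n_def = sum(defaulted)
    ranked = sorted(zip(scores, defaulted), key=lambda p: -p[0])  # riskiest first
    area = 0.0
    x_prev = y_prev = 0.0
    seen = captured = 0
    # Tied scores are grouped so the CAP curve is linear across a tie.
    for _, group in groupby(ranked, key=lambda p: p[0]):
        group = list(group)
        seen += len(group)
        captured += sum(d for _, d in group)
        x, y = seen / n, captured / n_def
        area += (x - x_prev) * (y + y_prev) / 2.0  # trapezoid rule
        x_prev, y_prev = x, y
    perfect_minus_random = 0.5 * (1.0 - n_def / n)
    return (area - 0.5) / perfect_minus_random


def auc(scores, defaulted):
    """AUC: probability that a defaulter is scored riskier than a non-defaulter."""
    bad = [s for s, d in zip(scores, defaulted) if d]
    good = [s for s, d in zip(scores, defaulted) if not d]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


# Made-up one-year default probabilities and default flags for six firms.
scores = [0.30, 0.20, 0.10, 0.05, 0.02, 0.01]
defaulted = [1, 0, 1, 0, 0, 0]
ar = accuracy_ratio(scores, defaulted)
print(ar, 2 * auc(scores, defaulted) - 1)
```

Both statistics depend only on the rank ordering of firms, which is why an ordinal measure such as an agency rating can be compared directly with a default probability.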
In this article, we use the Cumulative Accuracy Profile approach and report AR as our output. We compared EDF
credit measures to:
Ratings
Merton default probabilities
Z-Scores
RiskCalc EDF credit measures
3.3
The level validation of EDF credit measures verifies how well the model's predicted default rates track realized default
rates. We employ the same methodology described in Bohn, Arora and Korablev (2005), which was first developed in
Kurbat and Korablev (2002). The procedure is summarized in the following four steps:
1. Using a Monte Carlo technique, we simulate asset value movements based on a single-factor Gaussian model to
capture correlated defaults.
2. We determine the default/non-default state based on the level of each firm's EDF credit measure and each simulation
outcome.
3. We compare the actual default rate to the median, 10th percentile, and 90th percentile of the simulated distribution.
4. We compute the probability of observing a default rate less than or equal to the realized default rate given the model
and the correlation coefficient.
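The four steps above can be sketched as a small simulation. This is a simplified illustration under stated assumptions, not the procedure used to produce the results in this paper: the function names, the number of simulations, the seed, the percentile convention, and the sample portfolio are all hypothetical.

```python
import random
from statistics import NormalDist

_STD = NormalDist()


def simulate_default_rates(edfs, rho, n_sims=2000, seed=0):
    """Steps 1-2: simulate portfolio default rates under a one-factor Gaussian model.

    `edfs` are default probabilities in (0, 1); `rho` is the asset correlation.
    """
    rng = random.Random(seed)
    thresholds = [_STD.inv_cdf(p) for p in edfs]   # default if asset return is below
    a, b = rho ** 0.5, (1.0 - rho) ** 0.5
    rates = []
    for _ in range(n_sims):
        shock = rng.gauss(0.0, 1.0)                # aggregate shock shared by all firms
        defaults = sum(a * shock + b * rng.gauss(0.0, 1.0) < th for th in thresholds)
        rates.append(defaults / len(edfs))
    return sorted(rates)


def percentile(sorted_values, q):
    """Simple empirical percentile on an already sorted list."""
    return sorted_values[min(len(sorted_values) - 1, int(q * len(sorted_values)))]


def level_validation(edfs, rho, realized_rate, **kwargs):
    """Steps 3-4: summary statistics and P-value of the realized default rate."""
    rates = simulate_default_rates(edfs, rho, **kwargs)
    return {
        "p10": percentile(rates, 0.10),
        "median": percentile(rates, 0.50),
        "p90": percentile(rates, 0.90),
        "mean": sum(rates) / len(rates),
        "p_value": sum(r <= realized_rate for r in rates) / len(rates),
    }


# Hypothetical portfolio: 200 firms, each with a 2% EDF, asset correlation 0.19.
result = level_validation([0.02] * 200, rho=0.19, realized_rate=0.015)
print(result)
```

Because all firms share the aggregate shock, the simulated default rate distribution has a long right tail, so the median sits below the mean.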
We extend this methodology by using Bayesian methods to compute the posterior distribution of the aggregate shock
given the realized default rate, the model, and the correlation coefficient. The extension to the original methodology is
developed in Dwyer (2007).
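As a rough illustration of the idea of a posterior for the aggregate shock, the sketch below uses a grid approximation with a standard normal prior. It makes a strong simplifying assumption that every firm has the same EDF, so the number of defaults is binomial given the shock; this is not the method of Dwyer (2007), and the function names, grid settings, and inputs are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD = NormalDist()


def shock_posterior(n_firms, n_defaults, edf, rho, lo=-5.0, hi=5.0, steps=2001):
    """Grid posterior of the aggregate shock, assuming a homogeneous portfolio."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    threshold = _STD.inv_cdf(edf)
    log_weights = []
    for shock in grid:
        # Default probability conditional on the shock (a positive shock lowers it).
        p = _STD.cdf((threshold - sqrt(rho) * shock) / sqrt(1.0 - rho))
        p = min(max(p, 1e-300), 1.0 - 1e-16)
        log_lik = n_defaults * log(p) + (n_firms - n_defaults) * log(1.0 - p)
        log_weights.append(log_lik - 0.5 * shock * shock)  # add N(0, 1) log prior
    top = max(log_weights)
    weights = [exp(v - top) for v in log_weights]
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_quantile(grid, weights, q):
    """Smallest grid point whose cumulative posterior weight reaches q."""
    acc = 0.0
    for shock, w in zip(grid, weights):
        acc += w
        if acc >= q:
            return shock
    return grid[-1]


# Hypothetical year: 1,635 firms with a 3% EDF each, 16 defaults, correlation 0.19.
grid, weights = shock_posterior(1635, 16, edf=0.03, rho=0.19)
q10, q50, q90 = (posterior_quantile(grid, weights, q) for q in (0.10, 0.50, 0.90))
print(q10, q50, q90)
```

With fewer defaults than the EDF level implies, the posterior concentrates on positive shocks; with more defaults it shifts to negative shocks.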
values and the assumed correlation model. This prediction interval implies that eighty percent of the time the realized
default rate should lie within the 10th and the 90th percentiles. [6]
[6] This prediction interval differs from the concept of a confidence interval. An x% confidence interval is a random interval for which
the probability of it containing the true value of a parameter is x%. In our context, an x% prediction interval has the interpretation
that x% of the time the realized default rate will be within this range given the EDF levels and the correlation model.
3.4
This test analyzes the level bias in European EDF credit measures relative to that of U.S. EDF credit measures. The
rationale for the test is the assumption that similar risks should command similar premiums in the U.S. and Europe.
We compare the median as well as the 25th and 75th percentile CDS levels of the two regions, U.S. and Europe, across
EDF-implied rating groups. The same EDF categories should have the same aggregate median spreads in the CDS market across
the two regions. We used Mark-It composite CDS data from January 2003 to December 2006. The Europe region is based
on the following currencies: Euro, Austrian Schilling, Belgian Franc, Swiss Franc, Czech Koruna,
Deutsche Mark, Danish Krone, Spanish Peseta, Finnish Markka, French Franc, Greek Drachma, Hungarian Forint,
and British Pound. The U.S. region is based on the U.S. dollar.
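The comparison reduces to quartiles of CDS spreads within each region and EDF-implied rating group. A minimal sketch follows, assuming records of the form (region, rating group, spread in basis points); the function name, the group label, and the spreads are made up for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def spread_quartiles(records):
    """Return {(region, rating_group): (p25, median, p75)} of CDS spreads.

    Groups with fewer than two observations are skipped.
    """
    groups = defaultdict(list)
    for region, rating_group, spread in records:
        groups[(region, rating_group)].append(spread)
    out = {}
    for key, spreads in groups.items():
        if len(spreads) >= 2:
            p25, p50, p75 = quantiles(spreads, n=4, method="inclusive")
            out[key] = (p25, p50, p75)
    return out


# Made-up spreads in basis points for one EDF-implied rating group.
records = [("US", "Baa", s) for s in (10, 20, 30, 40, 50)]
records += [("Europe", "Baa", s) for s in (12, 22, 32, 42, 52)]
table = spread_quartiles(records)
print(table)
```

If EDF levels are comparable across regions, the medians for the same rating group should be close, as in this toy example.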
3.5
We calculate and compare median EDF credit measures for North American non-financial companies, Asian-Pacific
non-financial companies, European non-financial companies, and global financial companies by several rating categories.
In the absence of other measures of credit risk, e.g., spreads or defaults, a comparison with ratings provides a sanity check
on the rank ordering of risk produced by the EDF credit measure and on the comparability of the level of the EDF credit
measure across geographies.
EMPIRICAL RESULTS
In this section, we describe empirical results.
4.1
North America
In this section, we describe empirical results obtained in North America. Results are separated into U.S. and North
American companies that are headquartered outside of the U.S. These companies are predominantly headquartered in
Canada, Bermuda and the Cayman Islands.
4.1.1 Data
We start with all U.S. firms that have publicly traded equity from 1996-2006, unless otherwise specified. We restrict the
sample to non-financial firms with more than $30 million in size. [7] For level validation we impose a further restriction of
$300 million in size.
We also present results for comparable North American firms that are outside of the U.S. (Canada, Bermuda, Cayman
Islands, Bahamas, Belize, Panama, Virgin Islands, and Netherlands Antilles). Table 1 shows the countries and the
number of firm-months in each country that constitute the North American module in Credit Monitor and CreditEdge.
Outside of the U.S., the largest countries are Canada, Bermuda and the Cayman Islands.
TABLE 1
| Country | Number of Observations (firm-month) |
|---|---|
| Netherlands Antilles | 776 |
| Bahamas | 440 |
| Belize | 85 |
| Bermuda | 3,552 |
| Canada | 153,971 |
| Cayman Islands | 975 |
| Panama | 245 |
| USA | 1,127,452 |
| Virgin Islands | 491 |
[7] Size is measured by the sales of the firm for non-financial firms. Wherever the firm's total sales number was not available, we used
the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a
common denomination using a deflation adjustor calculated internally at Moody's KMV.
[8] To collect defaults, we use numerous printed and online sources from around the world on a daily basis. We use government filings,
government agency sources, company announcements, news services, specialized default news sources, and even sources within
financial institutions to ensure to the greatest extent possible that we find all defaults. We also keep evidence in electronic format so
that content can be easily verified. As a result, Moody's KMV has the most extensive default database for public firms.
The period 1996-2000 is shown on the left panel of Figure 4, and the period 2001-2006 is shown on the right panel of
Figure 4. Both EDF credit measures and ratings start at a higher level 24 months prior to default in the latter half of the
sample. EDF credit measures continued to lead the agency rating in each subperiod, indicating that EDF credit measures
indeed provide a more timely warning of impending defaults.
FIGURE 3 Comparison of median agency ratings with Moody's KMV EDF values for rated defaulted firms
in the U.S. from 2 years before default to 1 year after default between 1996 and 2006 (figure annotation: EDF measure is
leading rating by 11 months)
FIGURE 4 Comparison of median agency ratings with Moody's KMV EDF values for rated defaulted firms
in the U.S. from 2 years before default to 1 year after default for subsamples: 1996-2000 (left panel)
and 2001-2006 (right panel)
FIGURE 5 Cumulative Accuracy Performance (CAP) curves comparing Moody's KMV EDF credit
measures and agency ratings for U.S. non-financial companies between 1996 and 2006. The Accuracy
Ratios for EDF measure and agency rating are 0.88 and 0.75, respectively.
Table 2 illustrates the results for the subsamples. We find that the EDF credit measure substantially outperforms ratings
in all categories, with an Accuracy Ratio that is higher by at least 0.12.
TABLE 2 Accuracy Ratios by category for EDF Credit Measures and
agency ratings for U.S. non-financial companies
| Date / Size | EDF Credit Measure | Ratings |
|---|---|---|
| 1996-2006 | 0.88 | 0.75 |
| 1996-2000 | 0.87 | 0.75 |
| 2001-2006 | 0.88 | 0.75 |
| 1996-2006, Size > $30 Million | 0.88 | 0.75 |
| 1996-2006, Size $30-$300 Million | 0.75 | 0.57 |
| 1996-2006, Size > $300 Million | 0.89 | 0.76 |
We also calculated Accuracy Ratios at horizons longer than one year. The results are presented in Table 3. EDF
credit measures have more discriminatory power than agency ratings at all horizons, but the difference is smaller at
longer horizons.
TABLE 3 Accuracy Ratios of one- to five-year EDF credit measures and agency ratings
for U.S. non-financial companies between 1991 and 2006
| Horizon | EDF Credit Measure | Ratings | Number of Observations | Number of Defaults |
|---|---|---|---|---|
| One-year EDF credit measure | 0.88 | 0.76 | 2031 | 354 |
| Two-year EDF credit measure | 0.81 | 0.73 | 1926 | 374 |
| Three-year EDF credit measure | 0.77 | 0.71 | 1917 | 385 |
| Four-year EDF credit measure | 0.72 | 0.70 | 1892 | 400 |
| Five-year EDF credit measure | 0.69 | 0.68 | 1850 | 404 |
The Accuracy Ratios (AR) for both the EDF credit measure and agency rating decrease with horizon. The difference between ARs
becomes more compressed at longer horizons.
Figure 6 and Figure 7 present the Accuracy Ratios for the EDF credit measure and agency rating by year at one- and
five-year horizons, respectively. [9] For each year, we used the EDF credit measure as of the last market day of the prior year
to predict default during the next one or five years.
At a one-year horizon, the EDF credit measure has better discriminatory power than agency rating in all years except
1996, which had the fewest defaults. At a five-year horizon, the EDF credit measure also outperforms agency
rating in all years except 2000.
[9] The numbers underlying Figures 6 and 7 are summarized in Tables 15 and 16 of Appendix B.
FIGURE 6 Accuracy Ratios for EDF credit measures and agency ratings for U.S. non-financial companies
by year at the one-year horizon (vertical axis: Accuracy Ratio, 0.50 to 1.00; horizontal axis: 1996 to 2006)
FIGURE 7 Accuracy Ratios for EDF Credit Measures and agency ratings for U.S. non-financial
companies by year at the five-year horizon (vertical axis: Accuracy Ratio, 0.50 to 1.00; horizontal axis: 1991 to 2002)
EDF Credit Measure vs. Merton Default Probability and Z-Score
In this section we compare the performance of EDF credit measures to the Merton model's implied default probabilities
and Z-Scores as described in Section 2. The sample period is 1996 to 2006. Unlike the rated firms,
which are usually larger and higher profile, some of the unrated firms can be very small and their defaults can go
unnoticed. In some cases, informal negotiations or bailouts avoid the default. These cases are likely
to contaminate our results. Therefore we filtered out very small firms (size < 30 million dollars) from our sample. [10] For
the entire period 1996-2006, the results are shown in Figure 8. The results are presented on a joint sample of Z-Scores,
Merton default probabilities, and EDF credit measures, which requires each of these values to be non-missing.
We find that the EDF credit measure substantially outperforms the Merton default probability and Z-Score in terms of
ability to discriminate good firms from bad firms, with Accuracy Ratios of 0.82, 0.72, and 0.66, respectively. We
further divide the sample into subsets of sizes 30 million dollars to 300 million dollars, and 300 million dollars and
above. In both cases, the EDF credit measure outperforms the Merton model and Z-Score, as shown in Table 4.
Once again, as a robustness check, we compared the performance of the measures across the time periods
1996-2000 and 2001-2006. The results are shown in Table 4. As expected, our results are fairly robust, with EDF credit
measures outperforming Merton default probabilities and Z-Scores in both periods.
FIGURE 8 Cumulative Accuracy Performance (CAP) curves comparing Moody's KMV EDF credit measures,
Merton default probability and Z-Scores for U.S. non-financial companies between 1996 and 2006. The
Accuracy Ratios for EDF measure, Merton Default Probability and Z-Score are 0.82, 0.72 and 0.66,
respectively.
[10] Size is measured by the sales of the firm for non-financial firms. Whenever the firm's total sales number was not available, we used
the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a
common denomination using a deflation adjustor calculated internally at Moody's KMV.
TABLE 4 Summary of Accuracy Ratios across various size buckets and time horizons for EDF
credit measure, Merton default probability, and Z-Score for U.S. non-financial companies
| Date / Size | EDF Credit Measure | Z-Score | Merton Default Probability |
|---|---|---|---|
| 1996-2006, Size > $30 Million | 0.82 | 0.66 | 0.72 |
| 1996-2000, Size > $30 Million | 0.82 | 0.66 | 0.73 |
| 2001-2006, Size > $30 Million | 0.82 | 0.67 | 0.71 |
| 1996-2006, Size $30-$300 Million | 0.76 | 0.65 | 0.67 |
| 1996-2006, Size > $300 Million | 0.88 | 0.66 | 0.77 |
FIGURE 9 Cumulative Accuracy Performance (CAP) curves comparing Moody's KMV EDF credit
measures and RiskCalc EDF credit measures between 1996 and 2006 for U.S. non-financial
companies. The Accuracy Ratios for EDF measure and RiskCalc EDF measure are 0.82 and 0.68,
respectively.
TABLE 5 Summary of Accuracy Ratios for EDF Credit Measures and RiskCalc EDF Credit
Measures for U.S. non-financial companies by different size buckets and time periods
| Date / Size | EDF Credit Measure | RiskCalc EDF Credit Measure |
|---|---|---|
| 1996-2006, Size > $30 Million | 0.82 | 0.68 |
| 1996-2000, Size > $30 Million | 0.81 | 0.68 |
| 2001-2006, Size > $30 Million | 0.83 | 0.68 |
| 1996-2006, Size $30-$300 Million | 0.76 | 0.64 |
| 1996-2006, Size > $300 Million | 0.89 | 0.72 |
The EDF credit measure effectively discriminates between good and bad credits. It performed better than the Z-Score,
the RiskCalc private firm model applied to public firms, and a simple implementation of the Merton model. It leads rating changes in
predicting defaults, and it performs well across multiple cuts of the data and multiple horizons.
[12] The exception to this is the Merton model, but the default probabilities implied by the Merton model are too low, and therefore it
would usually underestimate the number of defaults.
Figure 10 presents the posterior distribution for the aggregate shock given the actual default rate, together with the P-values of the
actual default rate, that is, the probability of observing a default rate at or below the actual default rate.
The predicted default rate tracks the realized default rate very well, and every realized default rate falls within the
80% prediction interval. The largest gap is in 2003, which was an uncharacteristically good year for the economy, leading to a
substantially lower number of defaults. To explain the low default rate in 2003, we estimate that the U.S. economy
received a positive 0.84 standard deviation shock relative to market expectations. Such a positive shock is consistent with
the high returns on the S&P 500 observed during that year. The P-values of the realized default rate range from 21% to
75%, which is within the sampling variability that would be expected.
FIGURE 10 Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure less than 35%. We used an asset
correlation of 0.19 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the realized
default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line
is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50, is the
median of the posterior distribution of the aggregate shock; the gray lines, rm10 and rm90, are the 10th and 90th percentiles; the blue
line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
We summarize the numbers that underlie Figure 10 in Tables 6 and 7. Table 6 contains the number of firms, the number of
defaults, the median and mean predicted default rate per year, as well as the 10th and 90th percentiles for the predicted default
rate. It is clear from this table that the correlation effect makes the distribution of default rates right-skewed: most of the
probability mass lies to the left of the mean, so the median predicted default rate is below the mean in every year. If we ignored
this effect and had simply taken the mean default rate of the sample, we would have grossly over-predicted the realized
default rate. Table 7 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the
P-value by year.
TABLE 6 Comparison of mean and median predicted default rate with the realized
default rate between 1991 and 2006
| Year | Mean Predicted Default Rate | Median Predicted Default Rate | Realized Default Rate | 10th percentile | 90th percentile | Firms | Defaults |
|---|---|---|---|---|---|---|---|
| 1991 | 2.3% | 1.7% | 2.5% | 0.5% | 4.9% | 1554 | 39 |
| 1992 | 1.4% | 1.0% | 1.0% | 0.2% | 3.2% | 1549 | 15 |
| 1993 | 1.3% | 0.9% | 0.9% | 0.2% | 2.9% | 1639 | 15 |
| 1994 | 1.1% | 0.7% | 0.6% | 0.1% | 2.5% | 1775 | 10 |
| 1995 | 1.2% | 0.7% | 0.9% | 0.1% | 2.6% | 1847 | 16 |
| 1996 | 1.2% | 0.8% | 0.9% | 0.2% | 2.8% | 1906 | 17 |
| 1997 | 1.2% | 0.8% | 0.8% | 0.2% | 2.6% | 2054 | 17 |
| 1998 | 1.1% | 0.7% | 0.9% | 0.1% | 2.4% | 2114 | 20 |
| 1999 | 1.8% | 1.2% | 1.0% | 0.3% | 3.9% | 2106 | 22 |
| 2000 | 2.6% | 1.9% | 1.9% | 0.5% | 5.5% | 2042 | 38 |
| 2001 | 3.6% | 2.8% | 2.7% | 0.8% | 7.3% | 1783 | 48 |
| 2002 | 2.5% | 1.9% | 1.8% | 0.5% | 5.4% | 1707 | 31 |
| 2003 | 3.0% | 2.3% | 1.0% | 0.6% | 6.2% | 1635 | 16 |
| 2004 | 1.2% | 0.8% | 0.7% | 0.2% | 2.8% | 1699 | 12 |
| 2005 | 0.8% | 0.5% | 1.0% | 0.1% | 1.9% | 1806 | 18 |
| 2006 | 0.7% | 0.4% | 0.2% | 0.1% | 1.5% | 1835 | |
The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures less than 35%.
TABLE 7 Summary table of aggregate shock and year-wise probability of realizing the
actual number of defaults between 1991 and 2006
| Year | 10th Percentile | Median Aggregate Shock | 90th Percentile | Probability of having actual defaults or even lower |
|---|---|---|---|---|
| 1991 | -0.64 | -0.42 | -0.19 | 68.7% |
| 1992 | -0.28 | 0.03 | 0.34 | 51.7% |
| 1993 | -0.37 | -0.07 | 0.23 | 57.9% |
| 1994 | -0.16 | 0.19 | 0.52 | 47.3% |
| 1995 | -0.42 | -0.13 | 0.15 | 58.9% |
| 1996 | -0.35 | -0.06 | 0.23 | 54.7% |
| 1997 | -0.35 | -0.06 | 0.22 | 57.7% |
| 1998 | -0.52 | -0.25 | 0.01 | 64.2% |
| 1999 | -0.11 | 0.15 | 0.40 | 47.7% |
| 2000 | -0.20 | 0.02 | 0.23 | 51.3% |
| 2001 | -0.17 | 0.04 | 0.24 | 49.5% |
| 2002 | -0.21 | 0.03 | 0.26 | 52.0% |
| 2003 | 0.54 | 0.84 | 1.13 | 21.3% |
| 2004 | -0.20 | 0.13 | 0.45 | 51.4% |
| 2005 | -0.86 | -0.58 | -0.31 | 74.5% |
| 2006 | -0.02 | 0.46 | 0.92 | 45.1% |
The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures less than 35%.
FIGURE 11 Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure equal to 35%. We used an asset
correlation of 0.181 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for the
realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50,
is the median of the posterior distribution of the aggregate shock; the gray lines, rm10 and rm90, are the 10th and 90th percentiles; the
blue line is the P-value of the actual default rate, which is the probability of observing a default rate at or below the actual default rate.
We summarize the numbers that underlie Figure 11 in Tables 8 and 9. Table 8 contains the number of firms, the
number of defaults, the median and mean predicted default rate per year, as well as the 10th and 90th percentiles for the
predicted default rate. It is clear from this table that the correlation effect skews the distribution of default rates to the
left. Table 9 contains the median aggregate shock, the 10th and 90th percentiles of the aggregate shock, and the P-value by
year.
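The effect of asset correlation on the distribution of default rates can be illustrated with a small simulation. The sketch below is not the paper's procedure: it assumes a one-factor Gaussian model in which every firm has the same default probability, and the function names, the number of simulations, and the seed are illustrative choices. The inputs (30 firms, a 35% default probability, an asset correlation of 0.181) are taken from the 1991 row of Table 8 and the Figure 11 caption.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate_default_rates(n_firms, pd, asset_corr, n_sims=4000, seed=0):
    """Simulate yearly default rates for n_firms identical firms (assumed one-factor Gaussian model)."""
    rng = random.Random(seed)
    threshold = STD_NORMAL.inv_cdf(pd)
    rates = []
    for _ in range(n_sims):
        shock = rng.gauss(0.0, 1.0)  # aggregate shock shared by all firms in the year
        # Default probability conditional on the shock; a positive shock lowers it.
        cond_pd = STD_NORMAL.cdf(
            (threshold - asset_corr ** 0.5 * shock) / (1.0 - asset_corr) ** 0.5
        )
        defaults = sum(rng.random() < cond_pd for _ in range(n_firms))
        rates.append(defaults / n_firms)
    return sorted(rates)


def quantile(sorted_values, q):
    """Empirical quantile of an already sorted list (no interpolation)."""
    return sorted_values[int(q * (len(sorted_values) - 1))]


rates = simulate_default_rates(n_firms=30, pd=0.35, asset_corr=0.181)
mean_rate = sum(rates) / len(rates)
p10, median_rate, p90 = (quantile(rates, q) for q in (0.10, 0.50, 0.90))
print(f"mean={mean_rate:.3f} median={median_rate:.3f} p10={p10:.3f} p90={p90:.3f}")
```

Because all firms share the same aggregate shock, the simulated default rate varies far more from year to year than independent defaults would allow, which is why the 80% prediction interval is wide even when the predicted default probability is held fixed.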
TABLE 8
Comparison of mean and median predicted default rate with the realized
default rate between 1991 and 2006
Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   Percentile   Percentile
1991   35.0%            33.4%              40.0%          12.7%        59.7%        30      12
1992   35.0%            33.4%              24.0%          12.3%        60.3%        25
1993   35.0%            33.5%              33.3%          11.8%        60.9%        21
1994   35.0%            33.5%              11.8%          11.2%        61.9%        17
1995   35.0%            33.7%              15.4%          10.4%        63.5%        13
1996   35.0%            33.6%              12.5%          11.0%        62.2%        16
1997   35.0%            34.2%              11.1%          9.2%         67.1%
1998   35.0%            33.6%              66.7%          10.8%        62.6%        15      10
1999   35.0%            33.4%              35.1%          13.1%        59.2%        37      13
2000   35.0%            33.5%              38.3%          13.6%        58.8%        47      18
2001   35.0%            33.5%              40.2%          14.5%        57.8%        107     43
2002   35.0%            33.5%              41.0%          13.9%        58.4%        61      25
2003   35.0%            33.5%              38.0%          14.1%        58.2%        71      27
2004   35.0%            33.4%              23.1%          12.4%        60.1%        26
2005   35.0%            33.5%              10.0%          11.7%        61.1%        20
2006   35.0%            33.5%              16.7%          11.4%        61.6%        18
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure equal to 35%.
TABLE 9
Summary table of aggregate shock and year-wise probability of realizing the actual
number of defaults between 1991 and 2006
Year   10th Percentile   Median Aggregate Shock   90th Percentile   The probability of having actual defaults or even lower
1991   -0.86             -0.30                    0.24              63.1%
1992   -0.21             0.44                     1.06              30.6%
1993   -0.65             0.00                     0.63              50.0%
1994   0.13              0.94                     1.76              10.8%
1995   -0.14             0.69                     1.54              17.0%
1996   0.07              0.88                     1.71              12.0%
1997   -0.21             0.72                     1.68              12.4%
1998   -1.96             -1.23                    -0.53             92.8%
1999   -0.60             -0.08                    0.42              53.8%
2000   -0.71             -0.24                    0.22              60.2%
2001   -0.68             -0.35                    -0.04             64.4%
2002   -0.79             -0.38                    0.03              65.7%
2003   -0.63             -0.24                    0.15              60.0%
2004   -0.15             0.49                     1.11              28.7%
2005   0.30              1.08                     1.89              8.0%
2006   -0.03             0.73                     1.48              17.9%
The sample was restricted to U.S. firms larger than 300 million dollars with EDF credit measures equal to 35%.
EDF buckets
EDF Range
Correlation
Number
of Firms
0.015-
0.191
22887
0.177
1264
0.192
442
512
1235
Figures 12, 13, and 14 show the median and mean predicted default rates, the prediction interval for the realized default
rate, and the actual default rate for EDF values in the ranges [0.02, 5), [5, 12), and [12, 35), respectively. It is clear from
these figures that while the predicted and realized default rates can deviate from each other in certain years, there is no
substantial bias in their levels over the long run. In general, the two levels track each other very well. All realized default
rates fall within the prediction interval. Year 2003 was an uncharacteristically good year for the economy, leading to a
substantially lower number of defaults in two of the three subgroups.
FIGURE 12
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 0.01% and 5%. We used an
asset correlation of 0.191 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for
realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50
is the median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the
blue line is the P-value of the actual default rate, which is the probability of observing a default at or lower than the actual default rate.
FIGURE 13
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 5% and 12%. We used an
asset correlation of 0.177 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for
realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the posterior distribution of the aggregate shock and P-values. The dark
black line, rm50 is the median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th
percentiles; the blue line is the P-value of the actual default rate, which is the probability of observing a default at or lower than the
actual default rate.
FIGURE 14
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to U.S. firms larger than 300 million dollars and EDF credit measure between 12% and 34.99%. We used
an asset correlation of 0.192 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval
for realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate, and the red
dotted line is the realized default rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50
is the median for the posterior distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the
blue line is the P-value of the actual default rate, which is the probability of observing a default at or lower than the actual default rate.
FIGURE 15
Comparison of median agency ratings with Moody's KMV EDF values for defaulted firms
from two years before default to one year after default for North American companies outside the
U.S. and sample period between 1996 and 2006
13 Size is measured by the sales of the firm for non-financial firms. Whenever the firm's total sales number was not available, we used
the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a
common denomination using a deflation adjustor calculated internally at Moody's KMV.
FIGURE 16
Cumulative Accuracy Performance (CAP) curves comparing EDF credit measures,
Merton Default Probability and Z-Scores between 1996 and 2006 for North American companies
outside the U.S. The Accuracy Ratios for the EDF credit measure, Merton Default Probability and
Z-Score are 0.78, 0.70 and 0.65 respectively.
FIGURE 17
Comparison of median predicted default rate with the realized default rate, 1991-2006
The sample was restricted to North American firms outside the U.S. larger than 300 million dollars and EDF credit measure less than
35%. We used an asset correlation of 0.19 to simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction
interval for the realized default rate, the black line is the median predicted default rate, the blue line is the mean predicted default rate,
and the red dotted line is the realized default rate. The right panel shows the posterior distribution of the aggregate shock and P-values.
The dark black line, rm50, is the median of the aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles
for the aggregate shock; the blue line is the P-value of the actual default rate, which is the probability of observing a default at or lower
than the actual default rate.
4.1.8 Conclusion
Results obtained for the North American sample show that the EDF credit measure leads the agency rating in timely
default prediction. The EDF credit measure leads other alternative measures in its ability to discriminate good firms from
bad firms over time and across various subsections of the data. We also showed that the model predicted default rates
track realized default rates well and the model works well not only in the U.S., but also in North America excluding the
U.S.
4.2 Europe
14
A brief description of the similarities and differences among British, French, and German bankruptcy mechanisms is provided in
Korablev (2005).
A second characteristic is the nature of debt in an economy. A creditor-debtor relationship might be close (as in Japan),
or at arm's length (as in the U.S.). If the creditors are few and have a close relationship with the debtor, they are more
likely to evaluate the long-term potential of the debtor before taking it into bankruptcy. If the creditors are scattered,
there is a higher likelihood of a free-rider problem, leading to a forced bankruptcy even if the debtor may have some
long-term positive potential.
In general, we see an equal contribution of non-bankruptcy defaults and bankruptcies in North America, while the
European cases of distress are dominated by bankruptcies, as shown in Figure 18. This may be influenced by two
factors. First, in many economies within Europe, the debt is held more closely relative to that in the U.S., making it
more likely to enter private renegotiations of debt and avoid default during times of a liquidity crunch. Second, many
cases of defaults may not be covered by the media, and are in that sense hidden. These two factors should not be
applicable to larger firms because their debt is usually widely held, and they are followed more closely by media.
Figure 19 compares the percentage representation of default cases in Europe and North America by size over the period
of 1996-2006. Defaults as a fraction of total distress cases are substantially smaller in Europe for small and mid-sized
firms. Larger firms, however, have more comparable default behavior across Europe and North America. This shows that
the model validation is more reliable on the sample of large companies because of the quality of data on actual defaults.
FIGURE 18 (panels: North America and Europe; series: Bankruptcy and Defaults; x-axis: 1996 to 2006; y-axis: 0% to 100%)
FIGURE 19 Default events as a percentage of all distress cases across three size buckets
between 1996 and 2006
15 The following events constitute non-bankruptcy defaults: missed interest or principal payment, distressed extension of a loan,
distressed exchange offer, delay in paying a substantial portion of trade debt, and government takeover of a financial institution to
prevent market collapse.
The success of a model relies on the ability of the inputs to take regional nuances into account. A model whose inputs are
not universal in concept may have more difficulty capturing the differences in characteristics of the system in which it is
being implemented. As long as the economic fundamentals of a model are universal in nature, it is not necessary to
interpret its output differently across different regions. For the Moody's KMV EDF model, one of the main drivers is
asset value, which is inferred from the equity value and an underlying structural framework. The model should work well
for data from individual regions and for data pooled across them because the equity markets take into account the
regional differences.
The extent to which different equity markets accurately reflect firm value and volatility has implications for the power
and the level performances of the model. In fact, even if a model is powerful in discriminating defaulters from
non-defaulters in different regions, but is off in its level performance, the aggregation of data across regions will make the
model seem less powerful. For example, if a distance-to-default (DD) of 2 corresponds to an EDF credit measure of 5%
in the U.K., but 2% in France, then an aggregation of data would incorrectly suggest that both a U.K. firm and a French
firm with a DD of 2 correspond to the same rank in our test. In that sense, a default predictive power test on a dataset
aggregated across different regions essentially tests a joint hypothesis that the model is powerful and that the DD-to-EDF
mapping is similar across different regions. It could be the case that the model might be powerful in two regions
separately, but may appear less powerful if the data are aggregated.
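The joint-hypothesis point can be made concrete with a toy calculation. The sketch below uses made-up distance-to-default (DD) values for two hypothetical regions; none of these numbers come from the paper. It relies on the standard identity that the Accuracy Ratio equals 2 x AUC - 1, where AUC is the share of (defaulter, non-defaulter) pairs that the score ranks correctly.

```python
def auc(defaulter_dd, survivor_dd):
    """Share of (defaulter, survivor) pairs where the defaulter has the lower DD; ties count one half."""
    wins = 0.0
    for d in defaulter_dd:
        for s in survivor_dd:
            if d < s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(defaulter_dd) * len(survivor_dd))


def accuracy_ratio(defaulter_dd, survivor_dd):
    """Accuracy Ratio from the identity AR = 2 * AUC - 1."""
    return 2.0 * auc(defaulter_dd, survivor_dd) - 1.0


# Hypothetical DD values: each region ranks its own firms perfectly,
# but region B's DD scale is shifted upward relative to region A's.
region_a = {"defaulters": [1.0, 2.0], "survivors": [3.0, 4.0]}
region_b = {"defaulters": [3.5, 4.5], "survivors": [5.0, 6.0]}

ar_a = accuracy_ratio(region_a["defaulters"], region_a["survivors"])
ar_b = accuracy_ratio(region_b["defaulters"], region_b["survivors"])
ar_pooled = accuracy_ratio(
    region_a["defaulters"] + region_b["defaulters"],
    region_a["survivors"] + region_b["survivors"],
)
print(ar_a, ar_b, ar_pooled)  # 1.0 1.0 0.625
```

Within each region the ranking is perfect, yet pooling the two regions on the raw DD scale lowers the measured power because region B's defaulters sit above region A's non-defaulters.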
Similarly, while testing for levels, one could imagine that the model had specified levels in two regions incorrectly,
overestimating the default rate in one region and underestimating it in the other. However, it may work well on the
aggregated dataset. Therefore, a reasonable level performance on aggregated data is a necessary, but not a sufficient, test
for the level performance of the model in each region. Unfortunately, there is an insufficient number of defaults available
to perform a reliable level test in each subregion of Europe.
4.2.2 Data
We start with all European firms that have publicly traded equity between 1996 and 2006. The sample was then
restricted to non-financial firms with more than $30 million in size to mitigate the missing and hidden default problem. For
level validation, we imposed a further restriction of $300 million in size.
16 Following our practice in North America, size is measured by the sales of the firm for non-financial firms. Whenever the firm's total
sales number was not available, we used the book asset value of the firm. This number was further adjusted for inflation across
years by converting the numbers to a common denomination using the appropriate consumer price index and exchange rate.
TABLE 11
Country         Code   Size >= $30 Million   Size >= $300 Million
Austria         AUT    112                   60
Belgium         BEL    134                   77
Switzerland     CHE    232                   153
Czech Republic  CZE    75                    30
Germany         DEU    796                   394
Denmark         DNK    171                   77
Spain           ESP    177                   124
Finland         FIN    153                   81
France          FRA    897                   413
Great Britain   GBR    1959                  788
Greece          GRC    256                   66
Hungary         HUN    37                    15
Ireland         IRL    76                    37
Iceland         ISL
Israel          ISR    136                   48
Italy           ITA    301                   172
Luxembourg      LUX    31                    23
Netherlands     NLD    238                   152
Norway          NOR    232                   90
Poland          POL    118                   43
Portugal        PRT    90                    37
Russia          RUS    61                    57
Slovakia        SVK    16
Slovenia        SVN
Sweden          SWE    311                   134
Turkey          TUR    165                   55
We also present the results for level validation for subsample of countries that have more than 100 companies of size
$300 million. These countries include Switzerland, Germany, Spain, France, Great Britain, Italy, Netherlands, and
Sweden. The number of firms by country and size is shown in Table 11.
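The country selection can be checked directly against the counts in Table 11. The short sketch below copies the "Size >= $300 Million" column for the countries whose count is legible in the table and applies the more-than-100 threshold; the variable names are illustrative.

```python
# Firm counts with size >= $300 million, copied from Table 11.
# Countries whose count is not legible in the table are omitted.
firms_300m = {
    "Austria": 60, "Belgium": 77, "Switzerland": 153, "Czech Republic": 30,
    "Germany": 394, "Denmark": 77, "Spain": 124, "Finland": 81,
    "France": 413, "Great Britain": 788, "Greece": 66, "Hungary": 15,
    "Ireland": 37, "Israel": 48, "Italy": 172, "Luxembourg": 23,
    "Netherlands": 152, "Norway": 90, "Poland": 43, "Portugal": 37,
    "Russia": 57, "Sweden": 134, "Turkey": 55,
}

MIN_FIRMS = 100  # threshold stated in the text: more than 100 companies

selected = sorted(country for country, count in firms_300m.items() if count > MIN_FIRMS)
print(selected)
```

The resulting list contains the same eight countries named in the text.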
Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges, and
insolvency proceedings. For all comparisons against agency ratings, we used Moody's ratings.
FIGURE 20
Median agency ratings and Moody's KMV EDF values for rated defaulted firms in Europe
from 24 months before default to 10 months after default between 1996 and 2006
17 To collect defaults, we use numerous printed and online sources from around the world on a daily basis. We use government
filings, government agency sources, company announcements, news services, specialized default news sources, and even sources within
financial institutions to ensure, to the greatest extent possible, that we find all defaults. We also keep evidence in electronic format so
that content can be easily verified. As a result, Moody's KMV has the most extensive default database for public firms.
We divide the sample into subsets of sizes $30 million to $300 million, and $300 million and above. In both cases the
EDF credit measure outperforms the Merton model implied default probability and Z-Score, as shown in Table 12. All
the measures improve for larger firms.
As a robustness check, we compared the performance of the three measures across the time periods 1996-2000 and
2001-2006. The results, presented in Table 12, show that the EDF credit measure outperforms the Merton model
and Z-Score in both periods. The EDF credit measure and Merton default probability perform better in the 1996-2000
period, while the Z-Score has a higher Accuracy Ratio in the second period.
FIGURE 21
Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit measures,
Z-Scores and Merton default probabilities for European non-financial firms between 1996 and 2006.
The Accuracy Ratios for the EDF measure, Z-Score and Merton default probability are 0.79, 0.61 and 0.70,
respectively.
We summarize our findings in this section in Table 12. The results clearly show that the EDF credit measure in Europe
outperforms the other popular alternatives in its ability to discriminate good firms from bad firms at a 1-year horizon.
TABLE 12
Summary of Accuracy Ratios, across various size buckets and time periods for
European non-financial firms
Date                                           EDF Credit Measure   Z-Score   Simple Merton Model
1996-2006, Size > $30 Million                  0.79                 0.61      0.70
1996-2000, Size > $30 Million                  0.79                 0.53      0.71
2001-2006, Size > $30 Million                  0.78                 0.64      0.64
1996-2006, Size between $30 and $300 Million   0.75                 0.60      0.64
1996-2006, Size > $300 Million                 0.83                 0.65      0.77
FIGURE 22 Comparison of median predicted default rate with the realized default rate, 1996-2006
The sample was restricted to European non-financial firms larger than 300 million dollars. We used an asset correlation of 0.25 to
simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black
line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default
rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50 is the median for the posterior
distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the blue line is the P-value of the
actual default rate, which is the probability of observing a default at or lower than the actual default rate.
We summarize the numbers that underlie Figure 22 in Tables 13 and 14. Table 13 contains the number of firms, the number of defaults,
the median and mean predicted default rate per year, as well as the 10th and 90th percentiles for the predicted default rate. We find that
the mean predicted default rates are much larger than the median default rates, indicating that the correlation effect skews the
distribution of default rates to the left. If we ignored this effect and had simply taken the mean default rate of the sample, we would
have falsely concluded that the model overpredicts defaults. Table 14 contains the median aggregate shock, the 10th and 90th percentiles
of the aggregate shock, and the P-value by year.
TABLE 13
Comparison of mean and median predicted number of defaults with the realized
number of defaults between 1996 and 2006
Year   Mean Predicted   Median Predicted   Realized       10th         90th         Firms   Defaults
       Default Rate     Default Rate       Default Rate   Percentile   Percentile
1996   0.87%            0.40%              0.44%          0.00%        2.00%        1596
1997   0.90%            0.50%              0.37%          0.00%        2.10%        1610
1998   0.67%            0.30%              0.38%          0.00%        1.40%        1588
1999   0.98%            0.50%              0.35%          0.00%        2.20%        1692
2000   0.92%            0.50%              0.38%          0.00%        2.10%        1580
2001   1.34%            0.80%              0.88%          0.10%        3.10%        1360    12
2002   1.92%            1.20%              1.70%          0.20%        4.40%        1409    24
2003   3.09%            2.20%              0.66%          0.50%        6.80%        1513    10
2004   1.65%            1.00%              0.77%          0.20%        3.80%        1563    12
2005   1.14%            0.70%              0.31%          0.10%        2.60%        1627
2006   0.47%            0.20%              0.06%          0.00%        0.80%        1607
The sample was restricted to European firms larger than 300 million dollars.
TABLE 14
Year   10th Percentile   Median Aggregate Shock   90th Percentile   Probability of having actual defaults or even lower
1996   -0.30             0.04                     0.37              56.09%
1997   -0.16             0.21                     0.57              47.61%
1998   -0.44             -0.07                    0.28              60.46%
1999   -0.08             0.28                     0.62              46.15%
2000   -0.18             0.18                     0.53              48.31%
2001   -0.37             -0.08                    0.21              55.76%
2002   -0.52             -0.29                    -0.06             64.51%
2003   0.70              1.02                     1.32              17.02%
2004   -0.04             0.26                     0.54              42.95%
2005   0.15              0.54                     0.92              38.48%
2006   -0.05             0.57                     1.17              49.24%
The sample was restricted to European firms larger than 300 million dollars.
Results for Countries having at Least 100 Companies with Size Greater than $300 Million
We restricted the sample to the countries that have at least 100 companies with size greater than $300 million. These
countries tend to have larger equity markets. For these companies, the predicted default rate tracks the realized default
rate very well as shown in Figure 23. The relatively low default rate in year 2003 is indicative of a large positive shock.
The P-values of the realized default rate range from 20% to 69%, which is within the sampling variability that would be
expected.
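The P-value reported for each year is the probability of observing the actual number of defaults or fewer. A minimal sketch of how such a probability could be computed follows. It is not the paper's implementation: it assumes a one-factor Gaussian model and a homogeneous portfolio in which every firm has the same default probability, and the function names, integration grid, and example inputs are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def binom_cdf(d, n, q):
    """P(X <= d) for X ~ Binomial(n, q), summed in log space for numerical stability."""
    q = min(max(q, 1e-12), 1.0 - 1e-12)
    total = 0.0
    for k in range(d + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(q) + (n - k) * math.log1p(-q)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def p_value(defaults, n_firms, pd, asset_corr, z_max=8.0, step=0.02):
    """P(number of defaults <= defaults), integrating over the aggregate shock (trapezoid rule)."""
    threshold = STD_NORMAL.inv_cdf(pd)
    n_steps = int(round(2 * z_max / step))
    total = 0.0
    for i in range(n_steps + 1):
        z = -z_max + i * step
        # Default probability conditional on the aggregate shock z.
        cond_pd = STD_NORMAL.cdf(
            (threshold - math.sqrt(asset_corr) * z) / math.sqrt(1.0 - asset_corr)
        )
        weight = step * (0.5 if i in (0, n_steps) else 1.0)
        total += weight * STD_NORMAL.pdf(z) * binom_cdf(defaults, n_firms, cond_pd)
    return total


# Illustrative inputs: 12 defaults among 30 firms, each with a 35% default probability.
print(round(p_value(defaults=12, n_firms=30, pd=0.35, asset_corr=0.181), 3))
```

A P-value near 0% or 100% would indicate that the realized number of defaults is hard to reconcile with the predicted default probabilities; values in the middle of the range are consistent with ordinary sampling variability.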
FIGURE 23
Comparison of median predicted default rate with the realized default rate, 1996-2006
The sample was restricted to European non-financial firms larger than 300 million dollars from the following countries: Switzerland,
Germany, Spain, France, Great Britain, Italy, Netherlands, and Sweden. We used an asset correlation of 0.25 to simulate defaults in
each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black line is the median
predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The right
panel shows the aggregate shock distribution and P-values. The dark black line, rm50 is the median for the posterior distribution of
the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the blue line is the P-value of the actual default
rate, which is the probability of observing a default at or lower than the actual default rate.
12
Caa EDF-implied rating categories and found comparable results. The results are shown in Figures 25 and 26,
respectively. There was some overlap in the underlying names in the two currencies. However, our findings are robust to
using a completely non-overlapping sample as well. The subinvestment-grade names can be affected by liquidity risk that
differs across regions, thereby making the test less reliable.
FIGURE 24
Comparison of CDS spreads in the U.S. and Europe for Aa and above and A EDF-implied
rating categories (panel label recovered: Aa and above)
Blue lines represent 25th, median and 75th percentile of the CDS spread in Europe and red lines are similar data for the U.S.
12
The category Aaa is not shown because there were very few observations for CDS contracts in this category.
FIGURE 25
Comparison of CDS spreads in the U.S. and Europe for Baa and Ba EDF-implied rating
categories (panels: Baa, Ba)
Blue lines represent 25th, median and 75th percentile of the CDS spread in Europe and red lines are similar data for the U.S.
FIGURE 26
Comparison of CDS spreads in the U.S. and Europe for B and Caa EDF-implied rating
categories (panel label recovered: Caa)
Blue lines represent 25th, median and 75th percentile of the CDS spread in Europe and red lines are similar data for the U.S.
4.2.7 Conclusion
We showed that in Europe, EDF credit measures lead agency ratings in timely default prediction. EDF credit measures
lead other alternative measures in their ability to discriminate good firms from bad firms over time and across various
subsections of the data. Model-predicted default rates track realized default rates well, and CDS spreads are similar to
those in the U.S. for the same EDF-implied rating categories.
4.3 Asia
4.3.1 Data
We start with all Asian firms that have publicly traded equity from 1996 to 2006. We restrict the sample to non-financial
firms with more than $30 million in size (unless otherwise specified) to account for hidden or missing
defaults. Defaults are based on the Moody's KMV Default database and include missed payments, distressed exchanges,
and insolvency proceedings.
Table 15 shows the number of companies by country for two size categories, above $30 million and above $300 million,
that are in the Asian module of Credit Monitor and CreditEdge.
We decided to exclude some countries from level validation:
Australia and New Zealand, because they belong to the Pacific region
Japan, because it has a different economic structure and a hidden default problem
Pakistan and Sri Lanka, because they have a small number of companies
The remaining countries have the most comprehensive default coverage. These countries are Hong Kong, India,
Indonesia, Korea, Malaysia, Philippines, Singapore, Thailand, and Taiwan. We ran power and level validation tests
separately for Japan.
18 Size is measured by the sales of the firm for non-financial firms. Whenever the firm's total sales number was not available, we used
the book asset value of the firm. This number was further adjusted for inflation across years by converting the numbers to a
common denomination using a deflation adjustor calculated internally at Moody's KMV.
TABLE 15
Country         Code   Size >= $30 Million   Size >= $300 Million
Australia       AUS    844                   258
China           CHN    1357                  385
Hong Kong       HKG    695                   231
Indonesia       IDN    200                   64
India           IND    567                   174
Japan           JPN    3955                  2274
Korea           KOR    899                   377
Sri Lanka       LKA    19
Malaysia        MYS    643                   135
New Zealand     NZL    101                   47
Pakistan        PAK    83                    24
Philippines     PHL    96                    25
Singapore       SGP    494                   129
Thailand        THA    360                   79
Taiwan          TWN    1196                  310
FIGURE 27 Median agency ratings and Moody's KMV EDF values for all rated defaulted firms in Asia from
24 months before default to 10 months after default between 1996 and 2006. EDF values are displayed
on log scale.
FIGURE 28 Cumulative Accuracy Profile (CAP) curves comparing Moody's KMV EDF credit measures, Z-Scores,
and Merton Default Probabilities for Asian non-financial companies between 10/2001 and 12/2006. The Accuracy
Ratios for the EDF measure, Z-Score, and Merton Default Probabilities are 0.67, 0.57 and 0.56, respectively.
The EDF credit measure has more discriminatory power than Z-Score and Merton Default Probability in Japan.
Consistent with the results in the other nine countries, Z-Score has a higher Accuracy Ratio than Merton default probability.
CAP curves are presented in Figure 29. Accuracy Ratios of the EDF credit measure, Z-Score, and Merton default probability
are 0.89, 0.79 and 0.77, respectively.
FIGURE 29 CAP curves comparing Moody's KMV EDF credit measures, Z-Scores, and Merton Default Probabilities for
Japanese non-financial companies between 10/2001 and 12/2006. The Accuracy Ratios for the EDF credit measure,
Z-Score, and Merton Default Probabilities are 0.89, 0.79 and 0.77, respectively.
FIGURE 30
Comparison of median predicted default rate with the realized default rate, 1996-2006
The sample was restricted to Asian non-financial firms larger than 300 million dollars from the following countries: Hong Kong,
India, Indonesia, Korea, Malaysia, Philippines, Singapore, Thailand, and Taiwan. We used an asset correlation of 0.25 to simulate
defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black line is the
median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default rate. The
right panel shows the posterior distribution of the aggregate shock and P-values. The dark black line is the median
aggregate shock; the grey lines, rm10 and rm90, are the 10th and 90th percentiles for the aggregate shock; the blue line is the P-value of
the actual default rate, which is the probability of observing a default at or lower than the actual default rate.
FIGURE 31
Comparison of median predicted default rate with the realized default rate, 1996-2006
The sample was restricted to Japanese non-financial firms larger than 300 million dollars. We used an asset correlation of 0.25 to
simulate defaults in each year. On the left panel, the gray lines represent the 80% prediction interval for realized default rate, the black
line is the median predicted default rate, the blue line is the mean predicted default rate, and the red dotted line is the realized default
rate. The right panel shows the aggregate shock distribution and P-values. The dark black line, rm50 is the median for the posterior
distribution of the aggregate shock; the grey lines, rm10 and rm90 are the 10th and 90th percentiles; the blue line is the P-value of the
actual default rate, which is the probability of observing a default at or lower than the actual default rate.
4.3.5 Conclusion
We showed that in Asia, the EDF credit measures lead agency ratings in timely default prediction. For countries where
we have best default coverage, EDF credit measures lead other alternative measures in their ability to discriminate good
firms from bad firms over time and across various subsections of the data. Realized default rate for countries with better
default coverage lies within the prediction interval. In Japan, the EDF model discriminates distressed firms from healthy
firms very well.
4.4
Four panel graphs in Figure 32 display the median EDF credit measure by rating categories across different regions.
Levels of EDF credit measure for North American non-financial companies, Asia-Pacific non-financial companies, and
global financial companies are comparable for all rating categories. Levels in Europe are a bit lower for better quality
firms in rating categories of A and Baa.
FIGURE 32 (four panels; series: Ba and Baa; y-axis: EDF8, 0.01% to 100%; x-axis ticks: 01M90 to 01M10)
CONCLUSION
In this document, we tested the performance of the Moody's KMV EDF credit measure in its timeliness of default
prediction, ability to discriminate good firms from bad firms, and accuracy of levels. Whenever possible, we compared
the performance to other popular alternatives available to the market.
We find that the EDF credit measure performed well on all counts over time and across various subsections of the data.
We also showed that the Moody's KMV model works well not only in North America, but also in Europe and Asia. In
Europe, our findings are especially significant because our European sample consisted of various subregions with
substantially different debt-holding structures and bankruptcy mechanisms, which could have adversely impacted the
results if the model were not universal in concept.
While research at Moody's KMV continues to refine this measure and to take into account the nuances of the data as the
markets evolve and become more complex, we feel that as of now, this measure sets a standard
in the industry for a transparent and predictive absolute measure of the probability of default.
As a specific case, consider a sample that has $N$ defaulters and $M$ non-defaulters. The $i$th firm in the sample is assigned a default probability $p_i$. Without loss of generality, assume the firms are ordered so that $p_1 \ge p_2 \ge p_3 \ge \dots \ge p_{M+N}$. For each $p_i$, one can then assign a pair $(m(p_i), n(p_i))$, where $m(p_i)$ is the number of non-defaulters that have a probability of default greater than or equal to $p_i$, and $n(p_i)$ is the number of defaulters that have a probability of default greater than or equal to $p_i$. By construction, $m(p_{M+N}) = M$ and $n(p_{M+N}) = N$. These counts translate into fractions:
$$f_m(p_i) = \frac{m(p_i)}{M}, \qquad f_n(p_i) = \frac{n(p_i)}{N}, \qquad f(p_i) = \frac{m(p_i) + n(p_i)}{M+N}$$
Here $f_m(p_i)$ is the fraction of non-defaulters that have a default probability greater than or equal to $p_i$, $f_n(p_i)$ is the fraction of defaulters that have a default probability greater than or equal to $p_i$, and $f(p_i)$ is the fraction of all firms in the sample that have a default probability greater than or equal to $p_i$.
The CAP is now defined as the graph of $f_n(p_i)$ against $f(p_i)$ for all values of $p_i$. The ROC is the graph of $f_n(p_i)$ against $f_m(p_i)$ for all values of $p_i$. Alternatively, the ROC can be interpreted as the curve that plots the hit rate $f_n$ against the false alarm rate $f_m$ for any cut-off $C$, across all values of $C$.
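The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the results in this paper: the function name `cap_roc_points`, the input lists, and the five-firm sample are hypothetical, and firms with tied default probabilities are assumed to share a single point.

```python
def cap_roc_points(pds, defaulted):
    """Return one (p, f, f_m, f_n) tuple per distinct default probability.

    pds       -- default probabilities, one per firm (hypothetical input)
    defaulted -- True for defaulters, False for non-defaulters
    """
    n_def = sum(1 for d in defaulted if d)    # N in the text
    n_good = len(defaulted) - n_def           # M in the text
    # Order firms from highest to lowest default probability.
    ranked = sorted(zip(pds, defaulted), key=lambda firm: firm[0], reverse=True)
    points = []
    m = n = 0   # running counts m(p_i) and n(p_i)
    i = 0
    while i < len(ranked):
        p = ranked[i][0]
        # Count every firm whose probability equals p before emitting a point,
        # so that m and n cover all firms with probability >= p.
        while i < len(ranked) and ranked[i][0] == p:
            if ranked[i][1]:
                n += 1
            else:
                m += 1
            i += 1
        points.append((p, (m + n) / (n_good + n_def), m / n_good, n / n_def))
    return points


# Hypothetical sample: five firms, two of which default.
pds = [0.20, 0.10, 0.05, 0.02, 0.01]
defaulted = [True, False, True, False, False]
for p, f, f_m, f_n in cap_roc_points(pds, defaulted):
    print(f"p={p:.2f}  f={f:.2f}  f_m={f_m:.2f}  f_n={f_n:.2f}")
```

Plotting the `f_n` values against `f` gives the CAP for this sample, and plotting them against `f_m` gives the ROC.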
The Receiver Operating Characteristic (ROC) is a popular approach borrowed from medical science. ROC curves, also known as power curves, are a well-known way of establishing the ability of a model to distinguish signal from noise or, in our case, defaulters from non-defaulters. The basic approach takes a sample of $M+N$ firms, of which $M$ are good firms (non-defaulters) and $N$ are defaulted firms. If we rank the firms by their likelihood of defaulting, from highest to lowest, and exclude the riskiest $z$% of the sample, we exclude some actual defaulters and some non-defaulters. In total, we exclude $z/100 \cdot (M+N)$ firms.
A perfect model would exclude only defaulters as long as $z/100$ does not exceed $N/(M+N)$. A random model with no information would exclude $zM/100$ non-defaulters and $zN/100$ defaulters, that is, $z$% of each group. Suppose that excluding the riskiest $z$% of the sample removes $x(z)$% of non-defaulters and $y(z)$% of defaulters. By varying $z$ from 0 to 100, we obtain a set of pairs $(x, y)$. Plotting $y$ against $x$ gives a graph that indicates the ranking power of the model. For example, a model with no predictive power has its $(x, y)$ plot on the 45-degree line, while a perfect model is a flat horizontal line at 100%. Note that both $x$ and $y$ vary from 0 to 100%. Also, this test needs only an ordinal ranking and can therefore be used to compare all the various approaches to credit risk on the same plane.
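The exclusion procedure described above can be illustrated with a short Python sketch. The function names `exclusion_pair` and `perfect_y`, the rounding of the cut to a whole number of firms, and the five-firm sample are all assumptions made for illustration; ties in the ranking are not treated specially here.

```python
def exclusion_pair(pds, defaulted, z):
    """Percent of non-defaulters (x) and defaulters (y) removed
    when the riskiest z% of the sample is excluded."""
    total = len(pds)
    k = round(z * total / 100)   # number of firms excluded (assumed rounding rule)
    ranked = sorted(zip(pds, defaulted), key=lambda firm: firm[0], reverse=True)
    excluded = ranked[:k]
    n_def = sum(1 for d in defaulted if d)
    n_good = total - n_def
    x = 100 * sum(1 for _, d in excluded if not d) / n_good
    y = 100 * sum(1 for _, d in excluded if d) / n_def
    return x, y


def perfect_y(z, n_good, n_def):
    """Percent of defaulters a perfect ranking removes at cut z:
    only defaulters are excluded until none remain."""
    return min(100.0, z * (n_good + n_def) / n_def)


# Hypothetical sample: five firms, two of which default.
pds = [0.20, 0.10, 0.05, 0.02, 0.01]
defaulted = [True, False, True, False, False]
for z in (0, 20, 40, 60, 80, 100):
    x, y = exclusion_pair(pds, defaulted, z)
    # A random model would give x = y = z at every cut.
    print(f"z={z:3d}  x={x:6.2f}  y={y:6.2f}  perfect y={perfect_y(z, 3, 2):6.2f}")
```

Comparing the `y` column with the random benchmark (`y = z`) and with `perfect_y` shows where a given ranking sits between the uninformative and perfect cases.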
The area between the CAP curve and the 45-degree line, expressed as a fraction of the area between the perfect model's CAP curve and the 45-degree line, is called the Accuracy Ratio (AR). The area under the ROC curve is called the AUC (Area Under Curve). This is illustrated in Figures 33 and 34, both of which are based on the same population of non-defaulters and defaulters. Figure 33 shows the Cumulative Accuracy Profile curve, which is the curve along the outer edge of area A. A perfect model would have its Cumulative Accuracy Profile curve represented by the straight line along the outer edge of area B. The Accuracy Ratio is $A/(A+B)$. Figure 34 shows the ROC curve along the outer edge of the shaded area, which represents the AUC. It can be shown that $\mathrm{AR} = 2\,\mathrm{AUC} - 1$. For details and a proof of this relationship, refer to Engelmann, Hayden, and Tasche (2003).
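As a numerical check of the relationship between AR and AUC, the sketch below computes both areas with the trapezoid rule on a small hypothetical sample. The function names, the trapezoid integration, and the five-firm data are assumptions for illustration only; they are not the implementation behind the results reported in this paper.

```python
def curve_points(pds, defaulted):
    """CAP and ROC coordinates, riskiest firm first; tied PDs share one point."""
    n_def = sum(1 for d in defaulted if d)
    n_good = len(pds) - n_def
    ranked = sorted(zip(pds, defaulted), key=lambda firm: firm[0], reverse=True)
    cap, roc = [(0.0, 0.0)], [(0.0, 0.0)]
    m = n = 0
    for idx, (p, d) in enumerate(ranked):
        if d:
            n += 1
        else:
            m += 1
        # Emit a point only after the last firm sharing this probability.
        if idx + 1 == len(ranked) or ranked[idx + 1][0] != p:
            cap.append(((m + n) / len(ranked), n / n_def))
            roc.append((m / n_good, n / n_def))
    return cap, roc


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def accuracy_ratio_and_auc(pds, defaulted):
    cap, roc = curve_points(pds, defaulted)
    n_def = sum(1 for d in defaulted if d)
    n_good = len(pds) - n_def
    area_a = trapezoid_area(cap) - 0.5       # area A: model CAP above the 45-degree line
    area_a_plus_b = n_good / (2 * len(pds))  # A + B: perfect CAP above the 45-degree line
    return area_a / area_a_plus_b, trapezoid_area(roc)


# Hypothetical sample: five firms, two of which default.
pds = [0.20, 0.10, 0.05, 0.02, 0.01]
defaulted = [True, False, True, False, False]
ar, auc = accuracy_ratio_and_auc(pds, defaulted)
print(f"AR = {ar:.4f}, AUC = {auc:.4f}, 2*AUC - 1 = {2 * auc - 1:.4f}")
```

The denominator uses the fact that the perfect model's CAP rises linearly to 100% at $f = N/(M+N)$ and is flat afterwards, so its area above the 45-degree line is $M / (2(M+N))$.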
19 Defaulters are usually counted over a certain horizon. Therefore these tests are horizon specific. In this document, all tests are for a 1-year horizon.
FIGURE 33
FIGURE 34
Date                                       EDF Credit Measure   Ratings
1996 (EDF: 12/95; Defaults: 1/96-12/96)    0.72                 0.80
1997 (EDF: 12/96; Defaults: 1/97-12/97)    0.91                 0.80
1998 (EDF: 12/97; Defaults: 1/98-12/98)    0.90                 0.76
1999 (EDF: 12/98; Defaults: 1/99-12/99)    0.85                 0.75
2000 (EDF: 12/99; Defaults: 1/00-12/00)    0.76                 0.67
2001 (EDF: 12/00; Defaults: 1/01-12/01)    0.74                 0.65
2002 (EDF: 12/01; Defaults: 1/02-12/02)    0.79                 0.58
2003 (EDF: 12/02; Defaults: 1/03-12/03)    0.86                 0.77
2004 (EDF: 12/03; Defaults: 1/04-12/04)    0.85                 0.84
2005 (EDF: 12/04; Defaults: 1/05-12/05)    0.89                 0.75
2006 (EDF: 12/05; Defaults: 1/06-12/06)    0.96                 0.82
TABLE 17 Accuracy Ratios for EDF Credit Measures and agency ratings for U.S. non-financial companies by year at 5-year horizon
Date                                       EDF Credit Measure   Ratings
1991 (EDF: 12/90; Defaults: 1/91-1/96)     0.77                 0.73
1992 (EDF: 12/91; Defaults: 1/92-1/97)     0.84                 0.81
1993 (EDF: 12/92; Defaults: 1/93-1/98)     0.79                 0.76
1994 (EDF: 12/93; Defaults: 1/94-1/99)     0.67                 0.66
1995 (EDF: 12/94; Defaults: 1/95-1/00)     0.68                 0.58
1996 (EDF: 12/95; Defaults: 1/96-1/01)     0.68                 0.62
1997 (EDF: 12/96; Defaults: 1/97-1/02)     0.66                 0.62
1998 (EDF: 12/97; Defaults: 1/98-1/03)     0.67                 0.64
1999 (EDF: 12/98; Defaults: 1/99-1/04)     0.65                 0.65
2000 (EDF: 12/99; Defaults: 1/00-1/05)     0.61                 0.64
2001 (EDF: 12/00; Defaults: 1/01-1/06)     0.65                 0.61
2002 (EDF: 12/01; Defaults: 1/02-1/07)     0.71                 0.60
We used EDF data up until 12/2002 and default data up until 12/2006.
REFERENCES
1. Korablev, Irina, 2005, Power and Level Validation of the EDF Credit Measure in the European Market.
2. Bohn, Jeff, Navneet Arora, and Irina Korablev, 2005, Power and Level Validation of the Moody's KMV EDF Credit Measure in the U.S. Market.
3. Agrawal, Deepak, Navneet Arora, and Jeffrey Bohn, 2004, Parsimony in Practice: An EDF-based Model of Credit Spreads, Moody's KMV White Paper.
4. Arora, Navneet, Jeffrey Bohn, and Fanlin Zhu, 2005, Reduced vs. Structural Models of Credit Risk: A Case Study of Three Models, Moody's KMV Technical Document.
5. Crosbie, Peter, and Jeffrey Bohn, 2003, Modeling Default Risk, Moody's KMV Technical Document.
6. Das, Ashish, Amnon Levy, Anil Gurnaney, Jeffrey Bohn, Peter Crosbie, and Stephen Kealhofer, 2004, Modeling Portfolio Risk, Moody's KMV Technical Document.
7. Dwyer, Douglas W., and Shisheng Qu, 2007, EDF 8.0 Model Enhancements.
8. Dwyer, Douglas W., 2007, The Distribution of Defaults and Bayesian Model Validation, Journal of Risk Model Validation, Volume 1, No. 1.
9. Engelmann, Berndt, Evelyn Hayden, and Dirk Tasche, 2003, Testing Rating Accuracy, Risk, January 2003.
10. Eom, Young Ho, Jean Helwege, and Jing Zhi Huang, 2003, Structural Models of Corporate Bond Pricing: An Empirical Analysis, Review of Financial Studies.
11. Hull, John, 1999, Options, Futures and Other Derivatives, Fourth Edition, Prentice Hall Publications.
12. Johnson, Norman, Samuel Kotz, and Adrienne Kemp, 1993, Univariate Discrete Distributions, 2nd Ed., NY: Wiley.
13. Kurbat, Matt, and Irina Korablev, 2002, Methodology for Testing the Level of the EDF Credit Measure, Moody's KMV White Paper.
14. Lyden, Scott, and David Saraniti, 2000, An Empirical Analysis of Classical Theory of Corporate Security Valuation, Research Paper, Barclays Global Investors.