
How Well Does the Vasicek-Basel AIRB Model Fit the Data?
Evidence from a Long Time Series of Corporate Credit
Ratings Data
by

Paul H. Kupiec
-

Preliminary
September 2009

EXTENDED ABSTRACT
The Basel II AIRB framework uses Vasicek's asymptotic single factor model to set minimum
regulatory capital requirements for bank credit risk. I develop an estimation approach that
produces consistent estimates of the parameters in the Vasicek model as well as consistent
estimates of the common macro factor realizations that drive credit defaults. The model is
estimated using Moody's data on the default rates of rated corporate credits over the period
1920-2008. Model fit is assessed using robust statistics, and small-sample estimation issues are
examined. The treatment of observations with recorded default rates of zero is identified as an
important estimation issue that requires further study.

The Vasicek-Basel II AIRB default model is not capable of reproducing the observed variation in
Moody's corporate default rate data. The analysis shows that the true correlation among default
patterns for different credit grades is inconsistent with the Vasicek single common factor model
specification. Over the period 1920-2008, corporate rated credits either do not default or default
at much higher frequencies than are predicted by the model. In contrast to the model's
assumptions, the macro factor that drives corporate defaults exhibits strong positive
autocorrelation. Observed credit cycles last many years on average. As a consequence, long time
series are required to produce reliable model parameter estimates. It is impossible, for example,
to produce reliable estimates of the unconditional probability of default from samples as short as
5 or 10 years. In contrast to the Basel AIRB framework, highly rated corporate credits exhibit much
higher default rate correlations than lower-rated credits.

I use the econometric model to develop procedures to correct for business cycle effects when
estimating unconditional default rates when a credit class has only a limited sample of default
data. These procedures are useful not only for estimation, but also for back testing or attempting
to validate a bank's Basel AIRB parameter assignments.

-
Federal Deposit Insurance Corporation. The views expressed are those of the author and do not
reflect the views of the FDIC. I am grateful to Ed Kane and Matt Pritsker for comments on an
early draft of this study. Email: pkupiec@fdic.gov

HOW WELL DOES THE VASICEK-BASEL AIRB MODEL FIT THE DATA?
EVIDENCE FROM A LONG TIME SERIES OF CORPORATE CREDIT RATINGS DATA

I. INTRODUCTION
The Basel II Advanced Internal Ratings-Based (AIRB) framework is used to set
minimum regulatory capital requirements of the largest, most sophisticated
internationally-active banks. For example, the Financial Stability Institute (2006) reports
that 95 countries plan to implement Basel II by 2015, and more than 60 percent of the
Basel II adopters plan on including the AIRB option for credit risk capital requirements.¹
The AIRB regulatory framework uses an asymptotic version of Vasicek's (1987)
portfolio credit loss model to approximate the annual default rate distributions on
portfolios of credits that are differentiated by a bank-assigned credit rating. The AIRB
framework uses the Vasicek default rate distribution and estimates of loss given default
(LGD) and exposure at default (EAD) to approximate the credit loss distribution for each
credit grade portfolio.² Regulatory capital requirements are then set equal to the 99.9
percent upper-tail critical value of a credit grade's potential portfolio loss distribution.
Much has been written about Basel II, but few if any studies have formally
analyzed how well the Basel II Advanced Internal Ratings-Based (AIRB) model fits credit default
data produced by portfolios of credits categorized under a consistent credit rating system.
This paper develops a new approach for estimating the parameters of the Basel AIRB

¹ The 60 percent figure includes both Basel Committee members and nonmember
countries that plan on adopting the AIRB approach as an option for credit risk regulatory
capital calculations.
² The LGD and EAD estimates are not part of the Vasicek model. They are estimated
independently of the parameters of the Vasicek model.

model and new techniques for assessing the model fit relative to historical default rate
data on corporate credits rated by Moody's Investors Service.
The approach I propose uses panel regression methods to estimate the Vasicek-
Basel AIRB model parameters using time series data on a cross section of failure rates
from a consistent credit rating system. The methodology produces consistent estimates
of the unconditional default rates associated with each credit grade by correcting for
business cycle effects that are modeled using a single common factor. A consistent
estimate of the AIRB correlation parameter is derived directly from the default rate data.
The methodology also produces estimates of the common macro factor that is assumed to
drive default correlations.³ This common factor can be used to control for
macroeconomic effects when circumstances require an estimate of the unconditional
default rates for additional credit grades for which only brief time series histories are
available.
The ability to identify the common market factor and to control for its impact on
observed default rate realizations is particularly useful for back testing and model
validation analysis when true default rates (and thus the common market factor
realizations) are autocorrelated. The methodology allows the researcher to estimate or
test the unconditional default rate associated with a rating grade while correcting for the
sample dependent common factor realization. This approach offers a significant

³ The methodology can also be used to estimate multiple correlation parameters if the
data includes a sufficiently large number of credit grades.

improvement over tests that ignore time dependence in default rate realizations and treat
sample default rate estimates as unconditional mean default rate values.⁴
The proposed estimation methodology is implemented using default rate data
from Moody's Investors Service on rated corporate bond issues from 1920-2008. The
data are used to derive consistent estimates for the unconditional default rates associated
with the Aa, A, Baa, Ba, B and Caa-C rating categories as well as for the Vasicek default
correlation parameter.
The estimation exercise highlights the importance of zero default rate
observations in historical data. Under the asymptotic Vasicek-Basel AIRB model
assumptions, zero default rates should almost never occur, and yet one-third of the
Moody's sample reports a zero annual default rate. A reported default rate of zero can
arise because a portfolio is not truly asymptotic and so the reported default rate is
downward biased as a consequence of measurement error. Estimates can be adjusted to
account for a reasonable upper bound on the magnitude of the measurement error
associated with zero default rate observations, but the resulting model parameter
estimates are very sensitive to the treatment accorded zero default rate observations.
An important issue related to model parameter estimation is the behavior of
unconditional default parameter estimates from small samples. Because of the assumed
importance of the macro factor, default realizations are driven by market conditions
which must be controlled for when estimating the unconditional default rates associated
with a rating grade. While the Vasicek model specification assumes that the macro factor

⁴ Unconditional default rate tests include those proposed by Cantor and Falkenstein
(undated copy), Pluto and Tasche (undated copy), Cantor, Hamilton and Tennant (2007)
and Schuermann and Hanson (2004).

realizations are independent across time, in reality they have strong positive
autocorrelation. This positive autocorrelation makes it impossible to estimate
unconditional default rates from small samples unless the common factor effects are
properly controlled for in the estimation process using estimates of the common factor
realizations derived from external data.
The econometric model specification provides an intuitive process for correcting
for the aforementioned small sample bias in unconditional default rates. If consistent
estimates of the Vasicek common factor realizations and correlation parameter can be
recovered from a long time series panel data set of default rates, these estimates can be
used to construct consistent estimates of the unconditional default rates for auxiliary
rating grades when the new credit class has only a limited sample of realized default rates. I
derive the algorithm to make these adjustments and demonstrate the technique for default
data on Moody's alpha-numeric rating scale over the period 2001-2008.
The next section reviews the Vasicek-Basel AIRB portfolio default rate model.
Section III discusses the new technique for estimating the Vasicek-Basel model
parameters. Section IV discusses important estimation issues that arise because observed
default rates exhibit strong positive autocorrelation. Section V discusses the Moody's
corporate default rate data. Section VI discusses specific estimation issues that arise
because of the prevalence of zero default rates in the data. Section VII discusses the
model parameter estimates and parameter test statistics. Section VIII analyzes the small
sample estimation bias problem. Section IX discusses the small sample correction
algorithm that controls for the common factor realizations and produces consistent
estimates. A final section summarizes the results and concludes.

II. THE VASICEK PORTFOLIO CREDIT LOSS DISTRIBUTION MODEL
The Gaussian single factor model of portfolio credit losses (a.k.a. the Vasicek
model), developed by Vasicek (1987), Finger (1999), Schönbucher (2000), Gordy (2003)
and others, provides an approximation for the distribution of the default rate on a
well-diversified credit portfolio. The Vasicek model is a default-mode model, meaning that all
credits are assumed to either perform or default within the model's risk measurement
horizon. The asymptotic version of the Vasicek model focuses on a large diversified
portfolio in which idiosyncratic risk is fully diversified and the only source of portfolio
loss uncertainty is the default rate that is driven by the common latent Gaussian factor.⁵
The model measures the aggregate value of the losses generated by defaulting credits;
the income earned on non-defaulting credits is not recognized.⁶
The Vasicek model assumes that uncertainty on credit $i$ is driven by a latent
unobserved factor, $\tilde{V}_i$, with the following properties:
$$
\begin{aligned}
\tilde{V}_i &= \sqrt{\rho}\,\tilde{e}_M + \sqrt{1-\rho}\,\tilde{e}_{id}, \\
\tilde{e}_M &\sim \phi(e_M), \\
\tilde{e}_{id} &\sim \phi(e_{id}), \\
E(\tilde{e}_{id}\,\tilde{e}_{jd}) &= E(\tilde{e}_M\,\tilde{e}_{jd}) = 0, \quad \forall\, i \neq j,
\end{aligned}
\tag{1}
$$
where $\phi(\cdot)$ represents the standard normal density function. $\tilde{V}_i$ is distributed standard
normal, $E(\tilde{V}_i) = 0$ and $\sigma^2(\tilde{V}_i) = E(\tilde{V}_i^2) - E(\tilde{V}_i)^2 = 1$. $\tilde{V}_i$ is often interpreted as a proxy for

⁵ The model assumes the unconditional probability of default, exposure at default, and
loss rates in default (LGD) are known non-stochastic quantities for all obligors.
⁶ Kupiec (2006, 2007) develops a more general version of the Vasicek model in which
interest income on defaulting credits is recognized and offsets losses on defaulting
credits.

the market value of the firm that issued credit $i$. The common factor, $\tilde{e}_M$, induces
correlation between individual credit latent factor realizations,
$$
\rho = \frac{\operatorname{Cov}(\tilde{V}_i, \tilde{V}_j)}{\sigma(\tilde{V}_i)\,\sigma(\tilde{V}_j)}.
$$
Credit $i$ is assumed to default when its latent factor takes on a value less than a
credit-specific threshold, $\tilde{V}_i < D_i$. The unconditional probability that credit $i$ defaults is
$PD = \Phi(D_i)$, where $\Phi(\cdot)$ represents the cumulative standard normal distribution function.
Time is not an independent factor in this model, but is implicitly recognized through the
calibration of input values for PD.
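
The latent factor structure in equation (1) and the default threshold rule can be checked numerically. The following is a minimal sketch, not part of the original study: the function names, the seed, the number of draws, and the parameter values (a correlation of 0.20 and a PD of 2 percent) are illustrative assumptions. It draws pairs of latent factors that share one common factor and confirms that their sample correlation is close to the correlation parameter and that the default frequency is close to $\Phi(D)$.

```python
import math
import random
from statistics import NormalDist


def simulate_latent_pair(rho, n_draws, seed=0):
    """Draw (V_i, V_j) pairs that share one common factor, as in equation (1)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_draws):
        e_m = rng.gauss(0.0, 1.0)  # common factor, shared by both credits
        v_i = math.sqrt(rho) * e_m + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        v_j = math.sqrt(rho) * e_m + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        pairs.append((v_i, v_j))
    return pairs


def sample_correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rho, pd = 0.20, 0.02                    # illustrative values (assumptions)
threshold = NormalDist().inv_cdf(pd)    # D = Phi^{-1}(PD)
pairs = simulate_latent_pair(rho, n_draws=20000)
v_i = [p[0] for p in pairs]
v_j = [p[1] for p in pairs]

corr_hat = sample_correlation(v_i, v_j)                     # should be near rho
default_freq = sum(v < threshold for v in v_i) / len(v_i)   # should be near PD
print(f"sample correlation: {corr_hat:.3f}, default frequency: {default_freq:.4f}")
```
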
The loss rate on a portfolio of credits that have identical correlations, $\rho$, and
default thresholds, $D_i = D$, is determined as follows. Define a default indicator function
for each credit,
$$
\tilde{I}_i = \begin{cases} 1 & \text{if } \tilde{V}_i < D, \\ 0 & \text{otherwise.} \end{cases}
\tag{2}
$$
$\tilde{I}_i$ has a binomial distribution with an expected value of $\Phi(D)$. Define $\tilde{X}$ to be the
proportion of credits in the portfolio that default, $\tilde{X} = \frac{1}{n}\sum_{i=1}^{n} \tilde{I}_i$.
In an asymptotic portfolio, the number of individual credits is assumed to increase
without bound, $n \to \infty$. In the limit, idiosyncratic risks are completely diversified within
the portfolio and portfolio default rate uncertainty is driven by the common factor alone.
The unconditional distribution function of $\tilde{X}$, the asymptotic portfolio's default rate, is
given by,
$$
\Pr[\tilde{X} \le x] = \Phi\!\left(\frac{\sqrt{1-\rho}\,\Phi^{-1}(x) - \Phi^{-1}(PD)}{\sqrt{\rho}}\right), \quad x \in [0, 1].
\tag{3}
$$
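Conditional on a specific draw of the common factor, $\tilde{e}_M = e_M$, the conditional
value of the default indicator for a single credit is,
$$
\tilde{I}_i \mid \tilde{e}_M = e_M \;=\; \begin{cases} 1 & \text{if } \tilde{e}_{id} < \dfrac{D - \sqrt{\rho}\,e_M}{\sqrt{1-\rho}}, \\ 0 & \text{otherwise.} \end{cases}
\tag{4}
$$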
When the correlation and default thresholds are identical for all credits in the portfolio,
the conditional indicator functions assigned for each credit are independent and identically
distributed. As the number of credits in the portfolio increases without bound, conditional
on a value of $e_M$, the portfolio default rate distribution converges (almost surely) to a
non-random value that depends on $e_M$,
$$
\lim_{n\to\infty} \tilde{X} \mid \tilde{e}_M = e_M
= \lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} \left(\tilde{I}_i \mid \tilde{e}_M = e_M\right)
\xrightarrow{a.s.} E\!\left[\tilde{I}_i \mid \tilde{e}_M = e_M\right]
= \Phi\!\left(\frac{D - \sqrt{\rho}\,e_M}{\sqrt{1-\rho}}\right)
= \Phi\!\left(\frac{\Phi^{-1}(PD) - \sqrt{\rho}\,e_M}{\sqrt{1-\rho}}\right).
\tag{5}
$$
Equation (5) implies that, conditional on a specific realization of the common factor, an
asymptotic portfolio's default rate is a non-random value fully determined by two
parameters, the credits' unconditional default rate, PD, and the Vasicek correlation
parameter, $\rho$.
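
Equations (3) and (5) are two views of the same mapping, which a short numerical check makes concrete. The sketch below is illustrative only: the function names and the parameter values (a PD of 1 percent and a correlation of 0.20) are assumptions, not taken from the paper. It evaluates the conditional default rate of equation (5) at an adverse common factor draw (the 0.1 percentile of the standard normal) and then confirms, using equation (3), that this default rate is the 99.9 percent upper-tail value of the default rate distribution.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_default_rate(pd, rho, e_m):
    """Equation (5): asymptotic portfolio default rate given a common factor draw e_m."""
    z = (_STD_NORMAL.inv_cdf(pd) - math.sqrt(rho) * e_m) / math.sqrt(1.0 - rho)
    return _STD_NORMAL.cdf(z)


def default_rate_cdf(x, pd, rho):
    """Equation (3): Pr[X <= x] for the asymptotic portfolio default rate, 0 < x < 1."""
    z = (math.sqrt(1.0 - rho) * _STD_NORMAL.inv_cdf(x) - _STD_NORMAL.inv_cdf(pd)) / math.sqrt(rho)
    return _STD_NORMAL.cdf(z)


pd, rho = 0.01, 0.20                      # illustrative values (assumptions)
e_adverse = _STD_NORMAL.inv_cdf(0.001)    # common factor at its 0.1 percentile
x_999 = conditional_default_rate(pd, rho, e_adverse)
tail_probability = default_rate_cdf(x_999, pd, rho)

print(f"default rate at the adverse draw: {x_999:.4f}")
print(f"Pr[X <= that rate] from equation (3): {tail_probability:.6f}")
```

Because the default rate in equation (5) is decreasing in the common factor, a low percentile of the factor maps to a high percentile of the default rate, which is why the two functions agree.
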
Intuition can be gained by examining a simulated time series of default rates on a
set of asymptotic portfolios generated by equation (5). Consider a simulated time series
of default rates generated by four portfolios, each representing a different credit grade
within a rating system. These portfolios are distinguished by their credits' unconditional
probabilities of default, which are assumed to be $P1 = 0.25$, $P2 = 0.70$, $P3 = 1.00$,
and $P4 = 2.00$. All of the portfolios are assumed to have an identical correlation parameter,
$\rho = 0.20$.
Figure 1: Simulated Time Series of Default Rates on Four Asymptotic Portfolios
[Line chart. Horizontal axis: time (periods 1 to 49). Vertical axis: Portfolio Default Rate (0.00% to 8.00%). Series: P1, P2, P3, P4.]

Figure 1 plots simulated time series of default rates on these four hypothetical
asymptotic portfolios. The plots show that the default rates on these portfolios are very
highly correlated because the idiosyncratic risk of default is completely diversified and
the default rate is driven by only a single common factor. Table 1 reports the sample
default rate correlations. The Vasicek model correlation parameter is 20 percent for all
credits and yet the portfolio default rate realizations are nearly perfectly correlated.⁷
Figure 1 and Table 1 show that, under the asymptotic Vasicek model, realized portfolio
default rates will be nearly perfectly correlated regardless of the magnitude of the
correlation parameter $\rho$.⁸

⁷ The default rates would be exactly perfectly correlated except equation (5) applies
different non-linear transformations to the common Gaussian term $e_M$.
⁸ Somewhat paradoxically, Table A1 in the Appendix shows that the correlations among
portfolio default rates actually decline as the Vasicek correlation parameter, $\rho$, increases.

Table 1: Simulated Asymptotic Portfolio Sample Default Rate Correlations

         P1      P2      P3      P4
P1    1.000   0.996   0.992   0.977
P2            1.000   0.999   0.992
P3                    1.000   0.996
P4                            1.000
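
The simulation behind Figure 1 and Table 1 can be sketched in a few lines. This is an illustrative reconstruction, not the author's code: it assumes the four PD values are stated in percent (0.25%, 0.70%, 1.00%, 2.00%), uses 49 periods to match the figure's time axis, and picks an arbitrary seed, so the correlations it prints will differ from those in Table 1.

```python
import math
import random
from statistics import NormalDist


def simulate_default_rates(pds, rho, n_periods, seed=1):
    """Simulate asymptotic default rates from equation (5); all grades share one factor draw."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    thresholds = [std_normal.inv_cdf(p) for p in pds]
    paths = [[] for _ in pds]
    for _ in range(n_periods):
        e_m = rng.gauss(0.0, 1.0)  # one common factor realization per period
        for path, d in zip(paths, thresholds):
            z = (d - math.sqrt(rho) * e_m) / math.sqrt(1.0 - rho)
            path.append(std_normal.cdf(z))
    return paths


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


pds = [0.0025, 0.0070, 0.0100, 0.0200]   # assumption: Figure 1 PDs read as percentages
paths = simulate_default_rates(pds, rho=0.20, n_periods=49)
corr_matrix = [[correlation(a, b) for b in paths] for a in paths]

for label, row in zip(["P1", "P2", "P3", "P4"], corr_matrix):
    print(label, " ".join(f"{c:.3f}" for c in row))
```

The correlations are high but below one because each grade applies a different non-linear transformation to the same factor draw.
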


III. ESTIMATION OF THE ASYMPTOTIC VASICEK MODEL PARAMETERS
The parameters of the asymptotic Vasicek portfolio model can be consistently
estimated using panel data regression techniques. I adopt the common practice of
identifying a credit rating or grade with its unconditional probability of default. Equation
(5) implies that the default rate realization on an asymptotic portfolio of credits from an
identical grade is a nonlinear function of the unconditional default rate and correlation
parameters of its constituent credits.
Let $P_{jt}$ represent the realized default rate on portfolio $P_j$ in year $t$. Equation (5)
implies,
$$
\Phi^{-1}(P_{jt}) = \frac{\Phi^{-1}(PD_j)}{\sqrt{1-\rho}} - \frac{\sqrt{\rho}}{\sqrt{1-\rho}}\,e_{Mt},
\tag{6}
$$
where $PD_j$ is the unconditional probability of default for a credit in rating category $j$;
$e_{Mt}$ is the realized value of the unobserved common Gaussian factor; and $j = 1, \ldots, M$
represent $M$ individual credit rating categories.
Equation (6) is consistent with the theoretical predictions of the asymptotic
Vasicek default rate model, but in reality, observed default rates may deviate from their
theoretically predicted value by some mean-zero error term, $\tilde{\varepsilon}_{it}$. To complete the
empirical model, I assume that error terms are independent and identically distributed
across time, and uncorrelated cross-sectionally,
$$
\begin{aligned}
E(\tilde{\varepsilon}_{it}) &= 0, && \forall\, i, t, \\
E(\tilde{\varepsilon}_{it}\,\tilde{\varepsilon}_{jt}) &= 0, && \forall\, i \neq j, \\
E(\tilde{\varepsilon}_{it}\,\tilde{\varepsilon}_{j,t+k}) &= 0, && \text{for } \forall\, k \neq 0 \text{ and } j = 1, 2, 3, \ldots, M, \\
E(\tilde{\varepsilon}_{it}^{2}) &= \sigma_i^{2}.
\end{aligned}
\tag{7}
$$
The final condition allows each credit grade to be characterized by a different error
variance. Recognizing the possibility of the mean-zero model error, the empirical
specification of the Vasicek model is,
$$
\Phi^{-1}(\tilde{P}_{jt}) = \frac{\Phi^{-1}(PD_j)}{\sqrt{1-\rho}} - \frac{\sqrt{\rho}}{\sqrt{1-\rho}}\,e_{Mt} + \tilde{\varepsilon}_{jt}.
\tag{8}
$$
Consistent Estimation of the Transformed Model Parameters
Define $y_{jt} = \Phi^{-1}(P_{jt})$; $y_{jt}$ is the observed portfolio default rate transformed
through the inverse normal distribution function. Unlike the observed default rate, $y_{jt}$ is
not bounded between 0 and 1, but is instead a continuous variable in the range $(-\infty, \infty)$. Let
$a_j = \Phi^{-1}(PD_j)/\sqrt{1-\rho}$; $a_j$ is a time-invariant constant determined by credit
class $j$'s unconditional probability of default and the Vasicek correlation parameter.
Define $b_t = -\frac{\sqrt{\rho}}{\sqrt{1-\rho}}\,e_{Mt}$; $b_t$ is a scalar multiple of the common Gaussian factor
realization. Notice that the scalar multiple for $e_{Mt}$ does not depend on the asymptotic
portfolio's credit rating but only on the credits' correlation parameter. Using
these definitions, equation (8) can be written in more simplified notation as,
$$
\tilde{y}_{jt} = a_j + b_t + \tilde{\varepsilon}_{jt}.
\tag{9}
$$
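Under the model assumptions, the parameter $a_j$ can be consistently estimated as
the sample average from a time series of transformed default rate realizations on the credits in rating
category $j$. The sample average estimate is,
$$
\hat{a}_j = \frac{1}{T}\sum_{t=1}^{T} \tilde{y}_{jt} = a_j + \frac{1}{T}\sum_{t=1}^{T} b_t + \frac{1}{T}\sum_{t=1}^{T} \tilde{\varepsilon}_{jt},
\tag{10}
$$
which converges to $a_j$ because $\frac{1}{T}\sum_{t=1}^{T} b_t \xrightarrow{a.s.} 0$ and $\frac{1}{T}\sum_{t=1}^{T} \tilde{\varepsilon}_{jt} \xrightarrow{a.s.} 0$ under the model assumptions.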
A consistent estimate for $b_t$ can be derived from the average of the residuals from
a cross section of credit grades after each of the observed transformed default rates is corrected for a
time series estimate of its constant term,
$$
\hat{b}_t = \frac{1}{M}\sum_{j=1}^{M} \left(y_{jt} - \hat{a}_j\right) = b_t + \frac{1}{M}\sum_{j=1}^{M} \varepsilon_{jt}.
\tag{11}
$$
The model assumptions require $\frac{1}{M}\sum_{j=1}^{M} \varepsilon_{jt} \xrightarrow{a.s.} 0$ as the number of independent credit
grades becomes large.
Ignoring the precision of the constant parameter estimates (the $\hat{a}_j$'s), the precision
of the scaled macro factor estimates will depend on the number of independent credit
grades included in a cross section as well as the magnitude of the model error variance
associated with each credit grade. Smaller model error variances ($\sigma_i^2$, $i = 1, 2, 3, \ldots, M$) and a
larger number of cross sections will improve the precision of the scaled macro factor
estimates. If the model fits the data from each credit category poorly (large residual
variances) and there are few independent credit grades, the estimates cannot be expected
to provide a very accurate representation of the actual macro factor realizations. I will
provide a more detailed analysis of the small sample properties of the common factor
estimates from the model in a subsequent section.
The empirical model can be written in a panel regression format. To simplify the
notation and maintain clarity, I write the model in terms of three credit grades but the
notation generalizes to include additional credit grades. Let $X_{it} = (D1_{it} \;\; D2_{it} \;\; D3_{it})$ be
a vector of selection covariates that indicate membership in a credit grade (rating
category); for example, if $y_{it}$ is the transformed default rate associated with credit grade 1,
$y_{it} = y_{1t}$ and $X_{1t} = (1 \;\; 0 \;\; 0)$. Similarly, if $y_{it}$ is the transformed default rate associated
with credit grade 2, $y_{it} = y_{2t}$ and $X_{2t} = (0 \;\; 1 \;\; 0)$. Define $\tau_{it} = (t1_{it} \;\; t2_{it} \;\; \cdots \;\; tT_{it})$ to
be a selection vector that identifies the year associated with observation $y_{it}$. For example,
when $y_{it} = y_{i1}$, a default rate observation from year 1, $\tau_{i1} = (1 \;\; 0 \;\; 0 \;\; \cdots \;\; 0 \;\; 0)$; when
the observation is from year 3, $\tau_{i3} = (0 \;\; 0 \;\; 1 \;\; 0 \;\; \cdots \;\; 0)$. Using this notation, an
empirical model for a generic portfolio default rate observation is,
$$
y_{it} = X_{it}\,(a_1 \;\; a_2 \;\; a_3)^{T} + \tau_{it}\,(b_1 \;\; b_2 \;\; b_3 \;\; \cdots \;\; b_T)^{T} + \tilde{\varepsilon}_{it},
\tag{12}
$$
where $\tilde{\varepsilon}_{it}$ is the residual term.
Normally, in an analysis of covariance setting, the parameters in equation (12)
would be estimated using least squares (OLS) or weighted least squares after dropping a
dummy variable for one year.⁹ In the present case, we can make use of an additional
model restriction to identify a complete set of individual year effects. The Vasicek model
assumes that the common Gaussian factor is a standard normal variable. This assumption
imposes a restriction that the average time effect is 0. This additional assumption allows
for the consistent estimation of all of the model parameters by estimating equation (12)
under the restriction, $\sum_{t=1}^{T} b_t = 0$. The efficiency of model parameter estimates can be
improved by correcting for heteroskedasticity using generalized least squares, as the residual
variances differ among rating classes.
Estimation of the Correlation and Common Factor Realizations
Restricted OLS provides consistent estimates of the model's parameters, including
the rating-grade dependent intercepts and consistent estimates of the individual time
effects, $\hat{b}_t$, $t = 1, 2, 3, \ldots, T$. Because the Gaussian common factor, $\tilde{e}_{Mt}$, in the Vasicek
model has a standard deviation of 1 by assumption and the $\hat{b}_t$ estimators are consistent,
$\frac{1}{T}\sum_{t=1}^{T} \hat{b}_t^{2}$ is a consistent estimator for $\frac{\rho}{1-\rho}$. It follows that the consistent estimators
for the $e_{Mt}$ series are given by
$$
\hat{e}_{Mt} = \frac{-\hat{b}_t}{\sqrt{\frac{1}{T}\sum_{s=1}^{T} \hat{b}_s^{2}}}, \quad t = 1, 2, 3, \ldots, T,
$$
and
$$
\hat{\rho} = \frac{\frac{1}{T}\sum_{t=1}^{T} \hat{b}_t^{2}}{1 + \frac{1}{T}\sum_{t=1}^{T} \hat{b}_t^{2}}
$$
is a consistent estimator for $\rho$. Finally, $\Phi\!\left(\hat{a}_i \sqrt{1-\hat{\rho}}\right)$ is a consistent estimator of the
unconditional probability of default for rating grade $i$, $i = 1, 2, 3, \ldots$, and so all the

⁹ Weighted least squares could correct for heteroskedasticity if the variance of the error
term was different for each rating grade.

underlying parameters of the Vasicek model are identified using this panel regression
approach to model estimation.
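
The estimation steps in this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the estimation code used in the paper: it assumes a balanced panel with strictly positive default rates, in which case the restricted OLS estimates of equation (12) coincide with the simple time and cross-section averages of equations (10) and (11). The function name `estimate_vasicek`, the simulated panel, its parameter values, and the absence of model error in the simulation are all hypothetical choices made for the check.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def estimate_vasicek(default_rates):
    """Estimate (rho, PDs, common factor path) from a balanced panel.

    default_rates[j][t] is the realized default rate of grade j in year t;
    every entry must lie strictly between 0 and 1.
    """
    y = [[STD_NORMAL.inv_cdf(p) for p in row] for row in default_rates]
    n_grades, n_years = len(y), len(y[0])

    # Equation (10): time-series average of transformed default rates per grade.
    a_hat = [fmean(row) for row in y]

    # Equation (11): cross-section average of the de-meaned observations per year.
    b_hat = [fmean([y[j][t] - a_hat[j] for j in range(n_grades)])
             for t in range(n_years)]

    s2 = fmean([b * b for b in b_hat])              # estimates rho / (1 - rho)
    rho_hat = s2 / (1.0 + s2)
    e_hat = [-b / math.sqrt(s2) for b in b_hat]     # scaled, sign-adjusted time effects
    pd_hat = [STD_NORMAL.cdf(a * math.sqrt(1.0 - rho_hat)) for a in a_hat]
    return rho_hat, pd_hat, e_hat


# Hypothetical check: simulate an error-free panel from equation (6) and recover the inputs.
rng = random.Random(7)
true_rho, true_pds, n_years = 0.20, [0.002, 0.01, 0.05], 5000
e_true = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
panel = [
    [STD_NORMAL.cdf((STD_NORMAL.inv_cdf(pd) - math.sqrt(true_rho) * e)
                    / math.sqrt(1.0 - true_rho)) for e in e_true]
    for pd in true_pds
]

rho_hat, pd_hat, e_hat = estimate_vasicek(panel)
print(f"rho_hat = {rho_hat:.3f}")
print("pd_hat  =", [round(p, 4) for p in pd_hat])
```

The simulated sample is deliberately long and independent across years. With a short or autocorrelated sample the same formulas apply, but the averages they rely on need not be close to their limits.
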

IV. AUTOCORRELATION ISSUES
It is intuitively reasonable to think that, should the model be estimated using a
long time series of portfolio default rate data, the economic effect of the common factor
on the default rate should average out over the sample since the Vasicek model assumes
that the common factor's average effect is 0. The longer the time series, the better the
sample average of the unobserved common factor realizations should approximate 0
because, under model assumptions, the $\tilde{e}_{Mt}$ are independent identically distributed random
variables with mean 0. In reality, observed default rates are positively correlated across
time, which may be indicative of positive autocorrelation in the common factor
realizations.
The Vasicek model structure does not include the possibility of business cycles in
the data as there is no autocorrelation or time-series structure in the common Gaussian
factor specification. If credit ratings are updated annually to project a constant
conditional default rate for each credit grade, and these implicit performance forecasts are
efficient, there is no reason to expect autocorrelation in the deviations from the credit
grade's unconditional default rate. These arguments notwithstanding, observed default
rate data on rated corporate bonds exhibit positive autocorrelation; corporate default rates
exhibit clear evidence of credit cycles. Table 2 reports first-order autocorrelation
estimates for the annual default rates reported on selected corporate credits rated by
Moody's Investors Service and for the transformed default rates using the inverse
cumulative normal distribution. These estimates show evidence of strong positive
autocorrelation for the realized default rates and transformed default rates for all credits
except those rated Aa by Moody's.
Table 2: Credit Cycles in the Realized Default Rates on Rated Corporate Credits

                                                        lagged
                                                        dependent
dependent variable                intercept   p-value   variable    p-value   R²
Aa default rate                       0.055     0.005     -0.079      0.470   0.006
A default rate                        0.049     0.077      0.412      <.001   0.172
Baa default rate                      0.140     0.010      0.452      <.001   0.208
Ba default rate                       0.506     0.007      0.511      <.001   0.261
B default rate                        1.697     0.002      0.520      <.001   0.270
Caa-C default rate                    9.477     <.001      0.303      0.005   0.091
Φ⁻¹(Aa default rate)                 -3.860     <.001     -0.078      0.491   0.006
Φ⁻¹(A default rate)                  -1.170     <.001      0.666      <.001   0.444
Φ⁻¹(Baa default rate)                -1.715     <.001      0.478      <.001   0.231
Φ⁻¹(Ba default rate)                 -1.031     <.001      0.628      <.001   0.398
Φ⁻¹(B default rate)                  -1.176     <.001      0.494      <.001   0.245
Φ⁻¹(Caa-C default rate)              -1.218     <.001      0.314      0.003   0.100

Estimates are based on Moody's Corporate Default Rate Data, 1920-2008. Default rates
are the number of defaults in the year following a Moody's rating designation divided by
the number of rated credits in a credit grade. Default rates are measured as perc

There is little published literature on the properties of credit cycles, e.g., their
average length, symmetry, amplitude of boom and bust phases, or other features. The
AR(1) default rate model estimates reported in Table 2 suggest that shocks to default rates
have lingering measurable effects for about three years, but this specification may
be too simple to capture the full dynamics of default rate behavior.
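
The persistence implied by an AR(1) coefficient can be read off its impulse response: a one-time unit shock contributes $\alpha_1^k$ to the series $k$ years later. The sketch below is illustrative; the function name is an assumption, and the lag coefficients are the untransformed default rate estimates from Table 2 for the grades with significant autocorrelation. With coefficients near 0.5, roughly one eighth of a shock remains after three years.

```python
def ar1_impulse_response(alpha_1, horizon):
    """Share of a one-time unit shock still present k years later, for k = 0..horizon."""
    return [alpha_1 ** k for k in range(horizon + 1)]


# Lag coefficients for the untransformed default rates, taken from Table 2.
lag_coefficients = {"A": 0.412, "Baa": 0.452, "Ba": 0.511, "B": 0.520}

for grade, alpha_1 in lag_coefficients.items():
    response = ar1_impulse_response(alpha_1, horizon=4)
    print(grade, [round(r, 3) for r in response])
```
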
While the Vasicek model does not recognize the possibility of autocorrelation in
realized default rates, if the model is estimated over a long time series, even if default
rates are autocorrelated, the effect of credit cycles will average out to 0, and the
identifying restriction $\sum_{t=1}^{T} b_t = 0$ will be appropriate. Consider a case in which a sample
includes multiple complete cycles and a partial cycle. Estimating equation (12) while
imposing the restriction $\sum_{t=1}^{T} b_t = 0$ will result in biased estimates of the time effects because the
actual sample will not be balanced between positive and negative deviations from the
unconditional mean, so the restriction is inappropriate. The effect of partial cycles in a
sample will diminish as the sample time series lengthens and includes more and more
complete cycles. In the limit, even if there are credit cycles in the default data, the
restriction $\sum_{t=1}^{T} b_t = 0$ will hold exactly as $T \to \infty$.
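The importance of using a long time series to estimate equation (12) can be
illustrated in the context of a simple autoregressive model. Let $\tilde{R}_t$ represent a time series
generated by a stationary autoregressive process,
$$
\tilde{R}_t = \alpha_0 + \alpha_1 \tilde{R}_{t-1} + \tilde{u}_t, \quad \tilde{u}_t \sim \phi(0, \sigma), \quad |\alpha_1| < 1.
\tag{13}
$$
The unconditional mean of the process is $\frac{\alpha_0}{1-\alpha_1}$. The estimator for the
unconditional mean is the sample average,
$$
\frac{1}{T}\sum_{t=1}^{T} \tilde{R}_t = \alpha_0 + \alpha_1\,\frac{1}{T}\sum_{t=1}^{T} \tilde{R}_{t-1} + \frac{1}{T}\sum_{t=1}^{T} \tilde{u}_t.
$$
As the sample size $T$ becomes large, $\frac{1}{T}\sum_{t=1}^{T} \tilde{u}_t \xrightarrow{a.s.} 0$ and
$\frac{1}{T}\sum_{t=1}^{T} \tilde{R}_{t-1} \xrightarrow{a.s.} \frac{1}{T}\sum_{t=1}^{T} \tilde{R}_t$. Thus, as $T \to \infty$,
$$
\frac{1}{T}\sum_{t=1}^{T} \tilde{R}_t \xrightarrow{a.s.} \frac{\alpha_0}{1-\alpha_1}.
$$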
In equation (12), transformed default rates are equal to an unconditional mean
value plus a mean zero independent identically distributed innovation that includes the
effect of a macroeconomic factor. What happens if true transformed default rates follow
a stationary autoregressive process similar to equation (13) but we estimate equation (12)
without accounting for autocorrelation? Let $R_t$ represent the transformed autocorrelated
default rate series and define $v_t$ to be the realized deviations from the sample mean,
$v_t = R_t - \frac{1}{T}\sum_{t=1}^{T} R_t$. In any sample, OLS estimation will ensure $\sum_{t=1}^{T} v_t = 0$ by construction.
The prior discussion established that, in small samples, $\frac{1}{T}\sum_{t=1}^{T} \tilde{R}_t$ is a biased estimator of
$\frac{\alpha_0}{1-\alpha_1}$ and so the individual errors, $v_t$, are biased as well. As the sample length
$T$ increases, the sample mean converges to the true unconditional series average, and so
the $v_t$ estimates converge to the true macro factor innovations in equation (12)
notwithstanding the autocorrelation in default rates. While this discussion formally
establishes the consistency of $v_t$ for an AR(1) process, it can be shown that the result also
holds for any higher order stationary autoregressive process.
Thus far I have shown that OLS provides consistent estimates of the transformed
unconditional default rates and scaled macro factor shocks from a long time series of
observations when common factor realizations are autocorrelated, but the discussion thus
far does not provide any evidence on the length of the sample that is needed to obtain
reasonably accurate estimates when autocorrelation is ignored in the estimation.
Table 3 provides some evidence on the properties of the small sample distribution
of the sample mean estimate from two autoregressive processes that could be representative
of default rate dynamics. One process examined is the estimated AR(1) process for Baa
rated credits reported in Table 2. The other process is the empirical AR(1) model for the
Ba default rate also reported in Table 2. To analyze the small sample properties of the
sample mean estimate from a time series of default rates, the AR process is simulated 100
times with 121 observations in each sample. The first 21 observations from each sample
are omitted to remove the effect of initial conditions.

Table 3: Sampling Distribution for the Simple Average Estimated from a
Sample Generated by Alternative Autoregressive Processes

sample size        10       20       30       40       50      100

Baa process: $R_t = 0.141 + 0.4557\,R_{t-1} + e_t$, $e_t \sim (0, 0.4191)$
average         0.249    0.257    0.257    0.262    0.263    0.251
std dev         0.207    0.174    0.136    0.121    0.105    0.075
minimum        -0.307   -0.091   -0.185   -0.040    0.050    0.082
maximum         0.712    0.701    0.663    0.641    0.586    0.429

Ba process: $R_t = 0.5037 + 0.5145\,R_{t-1} + e_t$, $e_t \sim (0, 1.40)$
average         1.092    1.099    1.079    1.103    1.090    1.076
std dev         0.723    0.570    0.465    0.406    0.362    0.305
minimum        -0.379   -0.651   -0.353   -0.092   -0.076    0.367
maximum         2.972    2.437    2.221    2.164    2.046    2.177

Sampling distribution for the simple sample mean of two autoregressive processes based
on 100 bootstrap replications of the indicated sample size. The autoregressive processes are
the empirical AR(1) models for the Baa and Ba default rate processes with parameter
estimates given in Table 2. The true unconditional averages for the AR(1) processes
are 0.2590 for the Baa default rate process and 1.0375 for the Ba default rate process.
Each bootstrap sample begins after 21 excluded iterations attenuate the effects of initial
conditions.

For each of the 100 samples, the sample mean is calculated over alternative
subsample lengths, and the characteristics of the resulting sampling distributions are
reported in Table 3. The sampling distributions of the sample
mean estimator converge toward the true unconditional mean of each process, but the rate
of convergence is slow. Even with 100 observations in a time series sample, the sample
mean estimate still exhibits bias and significant variability. A comparison of the
alternative processes shows that convergence is faster for the mean estimate of the Baa
process. This result is intuitive, as the Baa process has weaker autocorrelation and a
smaller standard error in its independent Gaussian innovation.
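The simulation exercise behind Table 3 can be sketched as follows. This is a minimal illustration, not the code used to produce the table. It assumes that the second argument of the innovation distribution is a standard deviation, that each subsample mean is taken over the first n retained observations, and that the path starts at the unconditional mean; the function names are hypothetical.

```python
import random
from statistics import mean, stdev


def simulate_ar1(c, phi, sigma, n_obs, burn_in, rng):
    """Simulate R_t = c + phi * R_{t-1} + e_t and drop the first burn_in values."""
    r = c / (1.0 - phi)  # assumption: start the path at the unconditional mean
    path = []
    for t in range(burn_in + n_obs):
        r = c + phi * r + rng.gauss(0.0, sigma)  # sigma treated as a std deviation
        if t >= burn_in:
            path.append(r)
    return path


def sample_mean_distribution(c, phi, sigma, sizes, n_reps=100, burn_in=21, seed=0):
    """Return {size: (average, std dev, minimum, maximum)} of the sample mean."""
    rng = random.Random(seed)
    longest = max(sizes)
    means = {n: [] for n in sizes}
    for _ in range(n_reps):
        path = simulate_ar1(c, phi, sigma, longest, burn_in, rng)
        for n in sizes:
            means[n].append(mean(path[:n]))  # mean over the first n observations
    return {n: (mean(v), stdev(v), min(v), max(v)) for n, v in means.items()}


if __name__ == "__main__":
    # Baa process parameters as reported in Table 3
    for n, stats in sample_mean_distribution(
        0.141, 0.4557, 0.4191, [10, 20, 30, 40, 50, 100]
    ).items():
        print(n, [round(s, 3) for s in stats])
```

Because the draws are random, the output will not match Table 3 digit for digit; the point of the sketch is the slow decline in the dispersion of the sample mean as the sample size grows.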
To summarize the discussion thus far, these results demonstrate that it is possible
to generate consistent estimates of each credit grade's transformed unconditional default
rate as well as consistent estimates of the time series of scaled common factor realizations
using a long time series of realized default rates on a cross section of portfolios of
consistently-rated credits. While these parameters can be consistently estimated,
estimates derived from as much as 100 years of data are likely to have substantial
variability given the degree of autocorrelation in observed Moody's rated corporate
default rate data. In general, the characteristics of the rating agency default rate data are
not conducive to producing highly accurate estimates of each credit grade's unconditional
probability of default or of the common macro
factor realizations. Thus data constraints place practical limits on our ability to estimate
these parameter values with a high degree of resolution, even with 100 years of data.

V. PORTFOLIO DEFAULT RATE DATA
The parameters of the Vasicek asymptotic portfolio default rate distribution will
be estimated using annual default rate data on six different credit rating categories for
corporate bonds over the period 1920-2008 as reported by Moody's Investors
Corporation (2009) (Moody's). For each of its credit rating grades, Aaa, Aa, A, Baa, Ba,
B, and Caa-C, Moody's publishes annual default rate performance data. Default rates are
calculated as the number of issuers that were in a credit grade at the beginning of a year
and defaulted within the year, divided by the number of issuers that were in the credit
grade at the beginning of the year.[10]

Moody's rates a large number of issuers in each year of the sample, and most of
the rating categories include a large number of bonds in each annual cohort, although
some default rate observations are associated with relatively few bonds.[11] In the analysis,
I assume that each annual default rate observation is an approximation for the annual
default rate on an asymptotic portfolio of credits, and I interpret a Moody's credit rating
as an indicator of the issue's unconditional probability of default.[12] According to this
interpretation, each credit rating grade represents a group of obligors that have
approximately the same unconditional default rate over an annual horizon. I assume that
the targeted unconditional default rates associated with each individual credit grade are
fixed over the sample period.
[10] Moody's makes some numerical adjustment for issuers whose ratings were withdrawn
within the year. Moody's has argued that this adjustment has little effect on the reported
default rate data, and for purposes of this analysis I will ignore any issues created by
ratings withdrawals.
[11] Moody's does not publish data on the number of bonds in each rating grade and cohort
for the entire sample period. Moody's (2009) does provide partial information on the
number of bonds in a rating grade, and these data indicate that in some sample years the
Caa-C grade included relatively few bonds. In the analysis that follows, I will assess the
importance of the small portfolio size for Caa-C credits by reporting estimates excluding
this category. I am indebted to Matt Pritsker for calling my attention to this issue when he
discussed an earlier draft of this paper.
[12] Moody's would argue that a credit rating reflects an assessment of the expected
performance of an issue along multiple (unspecified) dimensions and does not represent a
ranking based only on the probability of default over a fixed horizon. This issue
notwithstanding, it is common to interpret a credit agency rating as an implicit estimate
of an issue's probability of default.

While the results of this study only provide direct evidence on the fit of the asymptotic
Vasicek model (Basel II AIRB) to long-term corporate bonds rated by
Moody's, these data play an important role in the Basel II framework. Not only did
rating agency data play an important role in the development of Basel II, but under Basel
II implementation standards, banks that lack a long time series of data on the default rate
performance of their own internal credit rating systems are permitted to map or
benchmark their internal systems to agency ratings, and to use default rate data published
by the credit rating agencies to calibrate the probability of default inputs for their Basel
AIRB regulatory capital calculations.[13] Rating agency data on corporate bonds,
moreover, provide the longest data series available on the default rate performance of
issues that were rated over time according to a consistent set of criteria.
The Moody's annual default rates for Aa, A, Baa, Ba, B and Caa-C rated credits
are plotted in Figure 2 in two separate panels to accommodate differences in default rate
scales. The Aaa rating grade is excluded from the analysis because, according to the
Moody's data, no Aaa-rated credits defaulted within the first year after being rated Aaa,
and so the data provide no information on the 1-year unconditional default rate associated
with an Aaa rating.[14]

[13] This rating agency mapping approach is described in the Basel Committee on
Banking Supervision (June 2004), page 94, paragraph 462.
[14] An unconditional default rate of 0 is not a realistic estimate, as these corporate credits
certainly have some associated default risk even if default is a remote event. This issue is
discussed in more detail below. There are alternative techniques that can be used to infer
the 1-year probability of default on these issues using transition matrix estimators. See, for
example, Lando and Skødeberg (2002).
Figure 2: Moody's Corporate Issuer-rated Default Rates: 1920-2008
[Two-panel time series plot. First panel: Aa, A, Baa (vertical axis 0.0 to 2.5). Second panel: Ba, B, Caa-C (vertical axis 0 to 100). Horizontal axis in both panels: Year, 1923 to 2007.]

The plots in Figure 2 show that the default rates on these rating classes are
positively correlated. Table 4 reports the sample correlation estimates along with the
sample average annual default rates. While the default rates show evidence of reasonably
strong positive correlation, the sample correlations are not nearly as strong as those that
are implied by the Vasicek asymptotic model for portfolio default rates. While
measurement error (discussed below) is expected to lower observed default rate
correlations, the correlation estimates from the Moody's data are far smaller than those
that would be expected under the ideal correlations reported in Table 1.
Table 4: Correlation Among Moody's Corporate Bond Annual Default Rates for
Alternative Ratings Grades, 1920-2008

                           Aa       A        Baa      Ba       B        Caa-C
Aa                         1        0.587    0.357    0.221    0.083    0.022
A                                   1        0.697    0.363    0.113    0.039
Baa                                          1        0.556    0.331    0.212
Ba                                                    1        0.711    0.350
B                                                              1        0.642
Caa-C                                                                   1
average default rate (%)   0.063    0.092    0.271    1.063    3.395    13.103


VI. DATA ISSUES
Before estimating the model, there are a number of data issues that merit
discussion. The plots in Figure 2 show many observations for which the reported annual
default rate for a credit class is 0, and indeed there are 12 years of data in which there
are no recorded defaults in any of the credit rating classes.[15] Under the
assumptions of the Vasicek model, there is virtually no probability that an asymptotic
portfolio with a positive unconditional probability of default should experience 0
defaults, and yet in almost 14 percent of the sample years, there are no recorded defaults
on any rated credits. Similarly, a 100 percent default rate should be an extremely rare
occurrence and, according to the Vasicek model, must coincide with very high default
rates on all portfolios contemporaneously, a pattern which is not exhibited in the sample.

[15] There are no recorded defaults in any of the credit rating grades in 1946, 1948, 1950,
1952, 1953, 1956, 1958, 1959, 1964, 1965, 1967, and 1969.

The prevalence of zeros in the Moody's default rate data (as well as the 100
percent default rate reported for Caa-C credits in 1984) may not be inconsistent with the
Vasicek-AIRB model if the Moody's rating grade portfolios are not truly asymptotic
portfolios, and surely they are not. The number of credits in each rating class is limited
and so the observed default rates include measurement error.
Table 5: Potential Measurement Error and Portfolio Size

number of obligors in     upper bound on the magnitude
a credit grade            of measurement error
10000                     1 bps
2000                      5 bps
1000                      10 bps
500                       20 bps
200                       50 bps
100                       100 bps


To understand the measurement error issue, consider a portfolio of 1000
independent obligors in a single credit grade that did not experience a default within a
year. This portfolio is certainly not an asymptotic portfolio even though it is likely to be
well-diversified by any practical standard. Consider the measured default rate on this
portfolio when we add a single credit and the new credit subsequently defaults. This
thought experiment provides the upper bound on this portfolio's default rate, 1/1001 = 0.000999 or
roughly 10 basis points. While the observed default rate is 0, the true unobserved default
rate could be as large as 10 basis points given the information in the portfolio. Table 5
illustrates the relationship between the number of independent obligors in a credit grade
and the magnitude of the upper bound on the potential measurement error associated with
the credit grade's observed default rate.
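The thought experiment above amounts to a one-line calculation. The sketch below (the function name is hypothetical) computes the bound 1/(N+1) in basis points; the entries in Table 5 appear to be these values rounded to whole basis points.

```python
def zero_default_upper_bound_bps(n_obligors):
    """Upper bound, in basis points, on the default rate of a portfolio of
    n_obligors credits with no observed defaults: add one credit that defaults."""
    return 10_000.0 / (n_obligors + 1)


if __name__ == "__main__":
    for n in (10_000, 2_000, 1_000, 500, 200, 100):
        print(n, round(zero_default_upper_bound_bps(n), 2))
```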
Conceptually, true asymptotic default rates of zero happen with zero probability,
and yet the observed default rates may be zero simply because the portfolios we observe
do not include enough credits. The number of credits in a zero default rate portfolio can
be used to estimate the upper bound on that portfolio's default rate for the year, but this
estimate almost surely overstates the true unobserved asymptotic portfolio default
experience.
Default rates of zero are problematic for estimation purposes as well. The inverse
normal transformation in equation (6) will not accommodate default rates of 0 (the
transformation results in a value of $-\infty$) or 100 percent ($+\infty$), and so these extreme
default rate observations must be truncated for estimation. There are many reported
default rates of 0 in the sample, and so the truncation value assigned to 0 default rate
observations could have a measurable effect on the model estimates. I report the
estimation results using alternative lower bound values for portfolio default rates.
In contrast to the 0 default rate problem, there is only one default rate of 100
percent in the sample.[16] A 100 percent default rate is also the likely result of
measurement error because the portfolio is not truly asymptotic. Because there is only
one observation with a 100 percent default rate, the truncation value that is selected for
that observation (within reasonable bounds) has little effect on the results I report.
Regardless, I also truncate this default rate using the rule: 100 percent minus the lower
bound used to truncate 0 default rates in the sample.[17]
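The truncation rule can be illustrated with a short sketch. It assumes that the transformation in equation (6) is the standard normal inverse CDF applied to the observed default rate expressed as a fraction; the function name and default lower bound are illustrative choices.

```python
from statistics import NormalDist


def transformed_default_rate(rate, lower_bound=0.0001):
    """Clip a default rate (as a fraction) to [lower_bound, 1 - lower_bound]
    and apply the standard normal inverse CDF.

    Without clipping, a rate of exactly 0 or 1 has no finite transform
    (NormalDist.inv_cdf raises StatisticsError for such inputs).
    """
    clipped = min(max(rate, lower_bound), 1.0 - lower_bound)
    return NormalDist().inv_cdf(clipped)


if __name__ == "__main__":
    # A recorded default rate of 0 under alternative lower bounds (1, 10, 20, 50 bps)
    for bound in (0.0001, 0.0010, 0.0020, 0.0050):
        print(bound, round(transformed_default_rate(0.0, bound), 3))
```

The transformed value assigned to a zero default rate moves substantially as the lower bound changes, which is why the choice of truncation value can affect the estimates.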

[16] The default rate reported for the Caa-C grade in 1984 is 100 percent.
[17] For example, if 0 default rates are truncated to .0001, then the single 100 percent
default rate observation is truncated to 1-.0001 or 99.99 percent.

In addition to the issue of selecting a lower bound for observed default rates,
another potentially important issue for estimation is how to handle the 12 years of data
for which there are no observed defaults in any rating category. These years almost
certainly represent years when there was a very strong economy (i.e., a large positive
draw from the common Gaussian factor), which is important information, but this
information alone is not very informative as it does not identify how good the economy
was, nor can it be used to identify unconditional default rate differences among the credit
rating grades. While the data can be included in the estimation sample with an additional
restriction that the time dummy variable takes on the same value for each of these years
(since they are all equally good according to the data), including these data with
uniformly truncated default rates may cause some additional bias issues. For example,
including all these data points at a common lower default rate boundary will not only
alter the estimate for the unconditional probability of default, but it also may impart an
upward bias to the estimate of the Vasicek correlation parameter.
To address this potential source of bias, one might be tempted to exclude these 12
years of data from the estimation sample. The Vasicek-AIRB model does not include any
time dependence in the Gaussian factor structure, so the omission of these dates does not
cause any dynamic issues in the model. However, estimation using a censored sample
would impart an upward bias on the estimates of the unconditional default rates and a
downward bias on the estimates of the model correlation parameter. So while the
censored sample approach cannot be expected to produce unbiased estimators of the
model parameters, it still may be useful to assess the sensitivity of parameter estimates to
the inclusion/exclusion of the 0 default years.

VII. MODEL ESTIMATION AND TESTING
Panels A through D of Table 6 report model parameter estimates under alternative
assumptions regarding the lower bound on portfolio default rates. In Panels A through C
the model coefficient estimates that correspond to credit grade covariates are statistically
significant and monotonically increasing (from grade Aa to grade Caa-C), a pattern that is
expected under the Vasicek model if the probability of default for a credit grade increases
monotonically as ratings progress from Aa to Caa-C. The monotonic pattern does not
hold for the estimates in Panel D, when default rates that are reported to be 0 are truncated at
50 bps probability of default.
The estimates in Panels A, B, and C of Table 6 show, unsurprisingly, that as the
lower bound on the portfolio default rate is increased, the estimated implied
unconditional probability of default increases across all credit grades. At the same time,
the estimates in Panels A through C show that progressive increases in the lower bound
on portfolio default rates result in a reduction in the estimated value of the Vasicek
model correlation parameter. The overall effect on correlation can be substantial; the
correlation parameter estimate falls from almost 20 percent when the truncation value is 1
basis point, to 5.5 percent when the truncation value is assumed to be 20 basis points.
The results in Table 6 demonstrate the importance of the treatment accorded
0 default rate observations when estimating the parameters of an asymptotic portfolio risk
model. The issue of how best to select an optimal lower bound for portfolio default rates
remains an important open issue. My earlier discussion suggests that the truncation rate
should be related to the potential measurement error in the data, which in turn will depend
on how many credits are in each rating grade in each annual cohort.[18] Based on the
number of credits in the individual Moody's rating categories in most years, a truncation
value of 20 basis points or larger is probably necessary to achieve a conservative estimate
of a rating grade's unconditional probability of default, as few cohort portfolios in the
sample routinely include as many as 500 separate obligor rated credits.

[18] These data are not publicly available for the entire sample, and further analysis of this
issue, while important, is beyond the scope of this study.

Table 6: Asymptotic Vasicek Model Estimates based on
Moody's Corporate Bond Rating Annual Performance Data
1920-2008

Moody's        parameter   standard   t statistic *   implied unconditional
rating grade   estimate    error                      PD in bps

Panel A: 0 default rates truncated to 1 bps
Aa             -3.581      0.061      -51.14            6.7
A              -3.512      0.058      -50.16            8.3
Baa            -3.270      0.053      -46.7            17.0
Ba             -2.760      0.058      -39.41           67.3
B              -2.311      0.064      -33             192.6
Caa-C          -1.811      0.086      -25.86          524.1
correlation parameter estimate: 0.198

Panel B: 0 default rates truncated to 10 bps
Aa             -3.030      0.043      -71.24           18.8
A              -3.011      0.041      -72.75           19.9
Baa            -2.882      0.039      -74.68           29.3
Ba             -2.548      0.040      -64.13           74.2
B              -2.148      0.046      -46.72          199.7
Caa-C          -1.628      0.066      -24.51          598.0
correlation parameter estimate: 0.085

Panel C: 0 default rates truncated to 20 bps
Aa             -2.844      0.037      -75.99           28.4
A              -2.842      0.037      -76.18           28.7
Baa            -2.751      0.036      -77.49           37.4
Ba             -2.476      0.035      -70.22           80.3
B              -2.093      0.041      -51.21          209.1
Caa-C          -1.566      0.061      -25.75          640.0
correlation parameter estimate: 0.055

Panel D: 0 default rates truncated to 50 bps
Aa             -2.579      0.032      -71.24           55.4
A              -2.600      0.034      -72.75           52.3
Baa            -2.564      0.034      -74.68           57.9
Ba             -2.374      0.032      -64.13           96.9
B              -2.015      0.036      -46.72          236.0
Caa-C          -1.477      0.055      -24.51          728.8
correlation parameter estimate: 0.030

Parameter estimates are generalized least squares estimates of equation (12)
based on 89 years of Moody's annual default rate data.
* All reported t test statistics are significantly different from zero at the .0001 level of the test.
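The implied unconditional PD column in Table 6 can be checked against the parameter estimates with a short calculation. The sketch below assumes, since equation (12) is not restated in this section, that each credit grade coefficient equals the inverse normal of the unconditional PD divided by the square root of one minus the correlation parameter. Under that assumption the function (a hypothetical name) reproduces the tabulated values to rounding, for example about 6.7 bps for grade Aa in Panel A.

```python
from math import sqrt
from statistics import NormalDist


def implied_pd_bps(coefficient, rho):
    """Implied unconditional PD in basis points.

    Assumed mapping (not restated in the text): coefficient = Phi^{-1}(PD) / sqrt(1 - rho),
    so PD = Phi(coefficient * sqrt(1 - rho)).
    """
    return NormalDist().cdf(coefficient * sqrt(1.0 - rho)) * 10_000


if __name__ == "__main__":
    # Table 6, Panel A: correlation parameter estimate 0.198
    for grade, coef in [("Aa", -3.581), ("A", -3.512), ("Baa", -3.270),
                        ("Ba", -2.760), ("B", -2.311), ("Caa-C", -1.811)]:
        print(grade, round(implied_pd_bps(coef, 0.198), 1))
```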

Table 7: Asymptotic Vasicek Model Estimates based on
Moody's Corporate Bond Rating Annual Performance
Data 1920-2008 Excluding Years with No Rated Bond
Defaults

Moody's        parameter   standard   t statistic *   implied unconditional
rating grade   estimate    error                      PD in bps

Panel A: 0 default rates truncated to 1 bps
Aa             -3.560      0.016      -57.71            5.4
A              -3.480      0.059      -59.04            7.0
Baa            -3.200      0.057      -56.32           16.6
Ba             -2.610      0.064      -40.8            83.1
B              -2.091      0.069      -30.46          274.9
Caa-C          -1.514      0.092      -16.49          823.9
correlation parameter estimate: 0.158

Panel B: 0 default rates truncated to 10 bps
Aa             -3.021      0.049      -70.38           17.6
A              -2.998      0.042      -72.02           18.8
Baa            -2.849      0.040      -71.02           29.5
Ba             -2.463      0.044      -56.03           86.5
B              -2.001      0.049      -40.91          265.6
Caa-C          -1.400      0.070      -20.02          881.3
correlation parameter estimate: 0.066

Panel C: 0 default rates truncated to 20 bps
Aa             -2.839      0.038      -74.82           27.6
A              -2.836      0.038      -75.00           27.9
Baa            -2.731      0.037      -73.94           38.1
Ba             -2.414      0.039      -61.85           91.7
B              -1.971      0.044      -45.15          270.3
Caa-C          -1.361      0.064      -21.27          917.5
correlation parameter estimate: 0.045

Panel D: 0 default rates truncated to 50 bps
Aa             -2.580      0.033      -77.98           54.5
A              -2.604      0.035      -73.69           50.9
Baa            -2.562      0.036      -71.89           57.4
Ba             -2.343      0.035      -66.93          104.0
B              -1.928      0.038      -50.24          285.6
Caa-C          -1.306      0.058      -22.52          987.6
correlation parameter estimate: 0.026

Parameter estimates are generalized least squares estimates of equation (12)
based on 77 years of Moody's annual default rate data. Years in which
there are no defaults among the bonds rated by Moody's are excluded from
the estimation sample.
* All reported t test statistics are significantly different from zero at the .0001 level of the test.


Table 7 reports model parameter estimates when the estimation data set excludes
the 12 years of data for which there are no defaults recorded in any credit rating class.
These estimates also imply a monotonic (inverse) relationship between credit quality (Aa
highest quality, Caa-C lowest quality) and the annual unconditional probability of default
on an asymptotic portfolio of credits, provided the truncation value assigned to zero default
rate observations is less than 50 basis points. The implied Vasicek model correlation
parameter varies from a high of 15.8 percent when the upper bound on measurement
error is assumed to be 1 basis point, to a low of 2.6 percent when the truncation value for
zero default rate observations is set to 50 basis points.
Moody's (2008) reports the number of issuers in each annual rating category
cohort from 1970-2008. These data show that the Caa-C ratings category includes
relatively few issuers in many years of the sample, and so the Caa-C default rate data
include large measurement errors relative to the true asymptotic portfolio
default rates for Caa-C-rated portfolios and relative to the other credit grade default rates reported by
Moody's. To assess the importance of this source of measurement error, I re-estimate the
model excluding the Caa-C ratings grade. When the Caa-C data are excluded from the
model, there are additional years in which there are no recorded defaults in the Aa, A,
Baa, Ba, or B rating grades.[19]

[19] The additional years are 1945, 1947, 1951, 1954, and 1968.
Table 8: Asymptotic Vasicek Model Estimates based on
Moody's Corporate Bond Rating Annual Performance Data
1920-2008 Excluding Caa-C Rating Grade

Moody's        parameter   standard   t statistic *   implied unconditional
rating grade   estimate    error                      PD in bps

Panel A: 0 default rates truncated to 1 bps
Aa             -3.581      0.048      -73.93            5.3
A              -3.512      0.045      -78.50            6.6
Baa            -3.270      0.045      -73.21           14.0
Ba             -2.760      0.050      -55.73           58.4
B              -2.311      0.059      -39.41          173.6
correlation parameter estimate: 0.165

Panel B: 0 default rates truncated to 10 bps
Aa             -3.030      0.030      -99.77           16.4
A              -3.011      0.029      -105.37          17.5
Baa            -2.882      0.028      -102.48          25.9
Ba             -2.548      0.032      -80.89           67.3
B              -2.148      0.041      -53             185.8
correlation parameter estimate: 0.059

Panel C: 0 default rates truncated to 20 bps
Aa             -2.844      0.025      -112.07          26.0
A              -2.842      0.025      -114.16          26.2
Baa            -2.751      0.025      -112.52          34.4
Ba             -2.476      0.027      -91.54           74.8
B              -2.093      0.036      -58.63          198.4
correlation parameter estimate: 0.035

Parameter estimates are generalized least squares estimates of equation (12)
based on 89 years of Moody's annual default rate data.
* All reported t test statistics are significantly different from zero at the .0001 level of the test.


Table 8 reports estimates of the Vasicek-AIRB model parameters when the Caa-C
ratings data are excluded from the estimation data set. The estimates reported in Table 8
are not materially different from the full sample estimates reported in Table 6, and
consequently I conclude that the measurement error bias introduced by including Caa-C
credits in the estimation is not of first-order importance relative to all the other estimation
issues one must face when attempting to estimate this model. Since there are benefits to
be gained from having an estimate of the Caa-C unconditional probability of default, I
include the Caa-C data in the remaining analysis.

Standard Errors of Vasicek Model Parameter Estimates
The Vasicek model parameter estimates are nonlinear transformations of the
restricted OLS parameter estimates, and so the standard errors of these estimates must be
obtained from an auxiliary analysis. To estimate critical values and the
variability of the Vasicek model parameter estimates, I construct a bootstrap
sampling distribution for the parameter estimates when zero default rates are truncated to
two different values: 10 basis points and 20 basis points.[20] I draw 5000 paired samples
(with replacement) from the underlying estimation sample of 89 observations, and for
each bootstrap sample, I estimate the Vasicek model parameters and thereby build the
sampling distribution for the estimates based on 89 observations. By using paired draws,
sampling both the dependent and the independent variables simultaneously, I
preserve heteroskedasticity features of the data and incorporate the consequences thereof
in the sampling distribution (and standard deviations) of the parameter estimates.[21]

[20] Efron (1979) and other papers that develop bootstrap and jackknife techniques appear in
the References.
[21] In other words, the bootstrapped sampling distributions are robust to any
heteroskedasticity in the data.
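The paired resampling scheme can be sketched generically as follows. This is an illustration under stated assumptions, not the estimation code: each row stands for one sample year (all grades' observations for that year are kept together), and `estimator` is a placeholder for whatever statistic is re-estimated on each resample. The macro factor restrictions that the estimation imposes on years with no defaults are not implemented here.

```python
import random
from statistics import mean


def paired_bootstrap(rows, estimator, n_reps=5000, seed=0):
    """Resample whole rows with replacement and re-estimate on each resample.

    rows      : list of per-year records; each record is drawn as a unit, so the
                dependent and independent variables for a year stay paired
    estimator : hypothetical callable mapping a list of rows to a number
    """
    rng = random.Random(seed)
    n = len(rows)
    draws = []
    for _ in range(n_reps):
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(resample))
    return draws


def percentile(values, pct):
    """Simple nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, min(len(ordered), round(pct / 100.0 * len(ordered))))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Toy data: (year, value) pairs; the estimator is the mean of the values.
    toy = [(1920 + i, float(i % 7)) for i in range(89)]
    dist = paired_bootstrap(toy, lambda rs: mean(v for _, v in rs), n_reps=1000)
    print(round(mean(dist), 3), percentile(dist, 5), percentile(dist, 95))
```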
Table 9: Sampling Distribution for Vasicek Model Parameter Estimates based on Moody's
Corporate Ratings Data, 1920-2008

Model parameter estimates when 0 default rates are truncated to 10 basis points, 5000
paired sample bootstrap replications

            Unconditional Probability of Default Parameters             correlation
            Aa       A        Baa      Ba       B        Caa-C         parameter
mean        18.78    19.95    29.38    74.59    201.57   608.45        0.085
median      18.68    19.79    29.08    74.04    199.41   594.94        0.085
mode        17.01    18.59    27.87    68.45    169.73   537.36        0.068
std dev     1.67     2.00     3.78     11.06    32.85    140.87        0.011
maximum     26.19    29.27    44.63    129.29   346.53   1350.87       0.128
quantiles
99          23.02    25.17    39.41    102.86   286.82   1000.48       0.111
95          21.75    23.48    36.01    93.90    259.96   858.70        0.103
90          20.97    22.64    34.33    89.15    245.01   793.24        0.099
10          16.74    17.53    24.75    60.94    161.97   440.04        0.070
5           16.23    16.92    23.64    57.43    150.97   398.88        0.067

Model parameter estimates when 0 default rates are truncated to 20 basis points, 5000
paired sample replications

            Unconditional Probability of Default Parameters             correlation
            Aa       A        Baa      Ba       B        Caa-C         parameter
mean        28.79    29.05    38.02    81.49    211.95   651.41        0.057
median      28.71    28.92    37.78    80.90    209.95   635.93        0.057
mode        25.21    24.69    29.48    64.29    141.94   443.02        0.040
std dev     1.66     1.95     3.65     10.08    30.66    138.39        0.009
maximum     35.29    38.27    54.11    135.14   345.97   1358.20       0.094
quantiles
99          33.08    33.99    47.42    106.98   290.34   1030.53       0.079
95          31.65    32.47    44.37    99.20    265.37   896.76        0.072
90          30.93    31.67    42.77    94.87    252.35   834.62        0.069
10          26.72    26.90    33.53    68.92    174.57   484.39        0.046
5           26.21    26.07    32.45    66.06    164.46   445.64        0.044

Bootstrap sampling distribution estimates based on 5000 paired resamplings of Moody's Investors
Corporate Bond Default Rate Data, 1920-2008. The model estimation restrictions are dynamically
modified to impose an identical macro factor value for all resampled observations for which there are no
defaults observed in any rating grade.

The re-sampled data can include multiple observations on any year of data in the
original sample. As a consequence, a single bootstrap sample can include any number of
observations for which there are no observed defaults in any of the rating categories.
In the bootstrap exercise, the model is estimated with restrictions on the macro factor to
require identical values for all years in the sample for which there are no observed
defaults in any credit grade. As a consequence, the actual restriction matrix imposed for
model estimation is potentially unique for each of the 5000 bootstrap replications.
The summary statistics for the sampling distributions of the Vasicek-AIRB model
parameter estimates are reported in Table 9. For the credit categories Aa through Baa,
unconditional default rates are fairly accurately estimated: standard errors of the
unconditional default rate sampling distributions are less than 10 percent of the mean
value of the unconditional probability of default estimates when zero default rate
observations are truncated to 10 basis points. As credit ratings decline in quality from Ba
to Caa-C, the precision of the unconditional default rate estimates declines. For the lowest
quality credits, Caa-C, the standard deviation of the sampling distribution of the unconditional
probability of default estimator is about 23 percent of the mean value when zero default
rate observations are truncated to 10 basis points. When the truncation value used for
zero default rate observations is increased to 20 basis points, the sampling distribution
dispersions decline uniformly across the credit grades, both absolutely and relative to the
mean of the sampling distributions.
The bootstrap procedure can also be used to generate the sampling distribution for
parameter hypothesis and model specification test statistics. For example, it may be of
interest to test whether there is a statistically significant difference among the rating
grades' unconditional probabilities of default. The statistical significance of differences
in these unconditional default rates can be measured directly from the sampling
distribution for the difference between the parameter estimates. Table 10 reports
descriptive statistics for the sampling distributions of estimates of differences in
unconditional default rates associated with adjacent rating grades.
Table 10: Sampling Distribution for Differences in Vasicek Model
Unconditional Default Rate Estimates based on Moody's Corporate
Ratings Data, 1920-2008

Model parameter estimates when 0 default rates are truncated to 10 basis
points, 5000 paired sample bootstrap replications

            Difference in Unconditional Probability of Default Parameters
            A minus Aa   Baa minus A   Ba minus Baa   B minus Ba   Caa-C minus B
mean        1.68         9.43          45.21          126.98       406.88
std dev     1.08         2.43          8.97           26.80        124.09
quantiles
99          4.05         15.88         68.40          197.79       748.07
95          3.05         13.80         60.79          174.52       627.36
5           -0.48        5.81          31.29          86.54        224.59
1           -1.12        4.71          27.25          72.45        171.35

Model parameter estimates when 0 default rates are truncated to 20 basis
points, 5000 paired sample bootstrap replications

            Difference in Unconditional Probability of Default Parameters
            A minus Aa   Baa minus A   Ba minus Baa   B minus Ba   Caa-C minus B
mean        0.27         8.97          43.47          130.46       439.46
std dev     1.01         2.44          8.22           24.99        122.59
quantiles
99          2.98         15.41         64.81          195.02       774.00
95          1.98         13.18         57.81          173.85       658.80
5           -1.28        5.27          30.88          92.39        260.42
1           -1.93        3.99          26.33          78.17        200.42

Bootstrap sampling distribution estimates based on 5000 paired resamplings of
Moody's Investors Corporate Bond Default Rate Data, 1920-2008. The model
estimation restrictions are dynamically modified to impose an identical macro
factor value for all resampled observations for which there are no defaults
observed in any rating grade.

The results reported in Table 10 suggest that all the rating grades except the Aa
and A categories differentiate credits according to their unconditional probability of
default. The results are qualitatively similar irrespective of whether zero default rates are
truncated to 10 or 20 basis points. The results suggest that the unconditional
default probability associated with each rating grade increases monotonically from grade
A through grade Caa-C. However, the 5-percent (and 1-percent) critical values of the
sampling distribution for the difference between A- and Aa-rated credits are negative, which suggests that
these credit grades are not statistically different in terms of their unconditional probability
of default. Figure 3 plots the sampling distribution for the difference between the
unconditional default rate estimates for Aa- and A-rated credits based on 5000 paired
bootstrap samples.
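[Figure 3: histogram of the bootstrap sampling distribution of the difference between the A and Aa unconditional default rate estimates (variable a_aa). Horizontal axis: -3.15 to 4.65; vertical axis: Percent, 0 to 12.]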

The estimation approach also produces consistent estimates of the realizations of
the single common factor that drives portfolio default rates in the Vasicek-AIRB model.
The common factor realization estimates are normalized year covariate parameter
estimates, indexed by year j = 1920, 1921, ..., 2008.
[Equation not recoverable from the source; the surviving symbols are e, M, b, j, t, and T.]

Figure 4 plots the mean and 90 percent probability bounds[22] of the sampling
distribution for the common factor realizations based on a bootstrap of 5000 paired
replications. Recall that, under the Vasicek-AIRB model, positive values of the common
factor are associated with low portfolio default rates whereas negative common factor
draws are associated with large default rates.

[22] The 90 percent probability bound comprises the 5th and 95th percentile levels of
the common factor's estimated sampling distribution.
Figure 4: Sampling Distribution for the Vasicek Model
Common Factor Estimates, 1920-2008
[Time series plot. Vertical axis: -3 to 2; horizontal axis: year, 1920 to 2007.]
Estimates based on Moody's annual default rate data on rated corporate credits, 1920-2008. Zero default rate
observations are truncated to 10 basis points. The sampling distribution is calculated from 5000 paired bootstrap
replications. The pink line is the 95th percentile of the sampling distribution. The blue line is the mean of the
sampling distribution. The yellow line is the 5th percentile of the sampling distribution.

The macro factor estimates plotted in Figure 4 show a clear pattern of credit
cycles in the default rate data. Realized default rates were above average for most of the
1920s and 1930s. Realized default rates were below average for all but one year over the
period 1941 to 1969. This long credit cycle was followed by two shorter credit cycles in
addition to the current downturn. Default rates were above average in both the late 1980s
and the late 1990s and rose again beginning in 2007.
The sampling distribution estimates clearly show temporal dependence among the
common factor realizations, as realized values of the factor are statistically positive or
statistically negative for extended periods of time. The Wald and Wolfowitz (1940) runs
test statistic for independence is 9.84 when calculated using the mean values of the
common factor sampling distributions. The common factor estimates fail a formal
nonparametric runs test for independence at any commonly used level of significance,
and thus there is strong statistical evidence that the temporal independence assumption of
the Vasicek-AIRB model is violated.
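To make the runs test concrete, the following is a minimal sketch of a Wald-Wolfowitz runs statistic applied to a series dichotomized around zero, using the usual large-sample normal approximation. The function name `runs_test_z` and the sign convention are assumptions made here for illustration; the paper does not state which convention underlies the reported value of 9.84, so the sketch is not claimed to reproduce it.

```python
import math

def runs_test_z(values, center=0.0):
    """Normal-approximation z statistic for the Wald-Wolfowitz runs test.

    Observations are classified as above or below `center`; values exactly
    equal to `center` are dropped. Too few runs (persistence) gives z < 0.
    """
    signs = [v > center for v in values if v != center]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need observations on both sides of the center")
    # A new run starts whenever the sign changes between neighbors.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n_pos + n_neg
    mean_runs = 2.0 * n_pos * n_neg / n + 1.0
    var_runs = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)) / (n ** 2 * (n - 1.0))
    return (runs - mean_runs) / math.sqrt(var_runs)

# Illustrative input only: a strongly persistent series with two long runs.
persistent = [-1.0] * 10 + [1.0] * 10
z_persistent = runs_test_z(persistent)
```

A series with long stretches of one sign, like the credit cycles described above, produces far fewer runs than independence would imply, and therefore a large statistic in absolute value.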
Table 11: Sampling Distributions for Vasicek Correlation Parameter when High- and
Low-Quality Credits are Modeled Separately

                            Aa, A, Baa        Ba, B, CaaC
                            rated credits     rated credits
mean                        0.205399794       0.324310576
median                      0.20525           0.3234
mode                        0.22236           0.29813
std deviation               0.023963989       0.032735678
Distribution Percentiles
  1 percent                 0.15115           0.24829
  5 percent                 0.1669            0.2695
  95 percent                0.24511           0.37813
  99 percent                0.26049           0.40091

Sampling distributions are based on a sample size of 75, with a
default rate lower bound of 1 basis point, and 5000 paired
sample bootstrap replications.

The bootstrap technique can also be used to test other aspects of the model
specification. The estimates reported thus far have imposed the restriction of a common
Vasicek correlation parameter across credit rating classes, and this may not be
appropriate. Table 11 reports summary statistics for the sampling distributions for the
Vasicek correlation parameter when the correlation parameter is allowed to differ
between investment grade (Aa, A, Baa) and sub-investment grade (Ba, B, CaaC)
groupings of the rating class portfolios. Figure 5 plots the full sampling distributions for
these alternative correlation parameter estimates based on 5000 replications of sample
size 75. [23] The statistics reported in Table 11 and the distributions plotted in Figure 5
impose a lower bound of 1 basis point on any default rate observations that are reported
to be 0 in the data.
The statistics reported in Table 11 and the plots in Figure 5 show that the data are
consistent with at least two different correlation parameters, one for highly-rated credits
and a different correlation parameter for lower-quality credits. The reported statistics
show that the correlation parameter for lower-quality credits is clearly greater than the
correlation parameter estimate that provides the best fit for investment grade credits.
Recall that the Basel II AIRB model imposes a regulatory correlation function in which
the correlation parameter decreases as a credit's unconditional probability of default
increases. The sampling distribution plotted in the lower panel of Figure 5 is the
distribution for the difference between the correlation parameter estimates for low quality
credits and high quality credits. The sampling distribution for the difference in these
correlation parameters clearly shows that the regulatory correlation structure is
inconsistent with the data. The evidence from this Moody's data suggests that the
Vasicek model default correlation parameter increases in magnitude as credit quality
declines.
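For reference, the regulatory correlation function that the estimates are being compared against can be written down directly. The sketch below uses the Basel II formula for corporate exposures without the firm-size adjustment; the function name `basel_corporate_correlation` is a label chosen here, not something taken from the paper.

```python
import math

def basel_corporate_correlation(pd):
    """Basel II AIRB asset correlation for corporate exposures.

    `pd` is the unconditional probability of default as a fraction (0.01 = 1%).
    The firm-size adjustment for small and medium-sized entities is omitted.
    """
    weight = (1.0 - math.exp(-50.0 * pd)) / (1.0 - math.exp(-50.0))
    return 0.12 * weight + 0.24 * (1.0 - weight)

# The regulatory correlation falls from 0.24 toward 0.12 as PD rises.
correlations = [basel_corporate_correlation(p) for p in (0.0003, 0.01, 0.05, 0.20)]
```

The function is monotonically decreasing in the probability of default, which is the opposite ordering to the one suggested by the sampling distributions in Table 11 and Figure 5.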

[23] The sample of 75 observations represents data from 1920-2006, excluding 12 years in
which no default observations were recorded in any rating category. This correlation
section has not yet been updated from an earlier draft to include 2008 data, GLS
estimates on the full sample including the 0 default years, and higher truncation
thresholds for 0 default rate observations.
Figure 5: Sampling Distributions for Correlation Parameter Estimates
when Aa, A, Baa Credits may have a Different Correlation than Ba, B,
CaaC Credits
[Figure: three histogram panels; horizontal axis: correlation estimate; vertical axis: probability (%).
Upper panel: Correlation Estimate for Aa, A, Baa Rated Credits.
Middle panel: Correlation Estimate for Ba, B, CaaC Rated Credits.
Lower panel: Ba, B, CaaC Correlation Estimate Less Correlation Estimate for Aa, A, Baa Rated Credits.]
Sampling distribution estimates are based on an estimation sample size of 75 years of data and
5000 paired sample bootstrap replications. The lower bound on portfolio default rates is
assumed to be 1 basis point, and years with no defaults on any rated credit are excluded.

Figures 6 and 7 provide some additional visual evidence regarding model fit.
Figure 6 plots the actual and predicted portfolio default rates for Moody's investment
grade credits. The investment grade data include a large number of zero default rate
observations. When the investment grade credit portfolios experience higher default rates
(in the 1920s and 1930s) the model produces elevated default rate predictions, but many
of these predictions fall far short of the actual recorded default rates. In a period
beginning in the late 1960s, the model predicts an elevated level of investment grade
defaults that only materializes for Baa-rated credits, and here the model's default rate
predictions are small relative to the default rates recorded on Baa-rated credits.
Figure 7 provides the actual and predicted portfolio default rates for Moody's
sub-investment grade credits. As the plots show, the Vasicek-Basel AIRB model fit
deteriorates for sub-investment grade credits. For the lowest rated credits, B and CaaC,
errors are large and concentrated in the period since the 1970s. The large errors for
sub-investment grade credits likely owe at least in part to an inappropriate restriction on
this group's correlation parameter. The model estimates in Figures 6 and 7 impose a
uniform correlation parameter for all credit rating classes, whereas the evidence reported
in Table 11 and Figure 5 shows that sub-investment grade credits have a larger correlation
parameter.
The model prediction errors depend in part on the truncation value selected for
zero default rate observations. In general, within a reasonable range of truncation values,
as the truncation value for zero default rate observations is increased, the unconditional
default rate estimates increase and the model correlation parameter estimate decreases.
While I have not done an exhaustive analysis of alternative truncation values, and the
best truncation value may be grade specific and depend on the number of obligors in each
credit grade, a uniform 20 basis point truncation value seems to be a reasonable
compromise relative to an objective of minimizing root mean-squared prediction errors
across the credit grades.
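The role of the truncation value can be illustrated with a short sketch. The inverse normal transform is undefined at a default rate of exactly zero, so zero observations must be replaced by a small positive floor before the model can be fit. The helper names below (`truncate_zero_rates`, `probit`, `rmse_bps`) are hypothetical, and measuring the error against the recorded default rates is an assumption; the paper does not spell out that detail in this section.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def truncate_zero_rates(rates, floor):
    """Replace default rates recorded as exactly zero with `floor` (a fraction)."""
    return [floor if r == 0.0 else r for r in rates]

def probit(rates):
    """Inverse standard normal transform; requires every rate strictly in (0, 1)."""
    return [STD_NORMAL.inv_cdf(r) for r in rates]

def rmse_bps(actual, predicted):
    """Root mean-squared prediction error, expressed in basis points."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return 1e4 * math.sqrt(sum(squared) / len(squared))

# Illustrative numbers only, not Moody's data.
recorded = [0.0, 0.0, 0.004, 0.012]
floored_20bps = truncate_zero_rates(recorded, floor=0.0020)
transformed = probit(floored_20bps)
```

Raising the floor moves every zero observation closer to the non-zero observations in the transformed scale, which is consistent with the pattern described above: higher unconditional default rate estimates and a lower correlation estimate.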
Table 12: Model Prediction Error Rates for Alternative Zero Default Rate
Truncation Choices

              RMSE for truncation value
Rating        20 bps        50 bps
Aa            24.4          34.6
A             25.1          35.4
Baa           30.4          36.2
Ba            112.6         112.5
B             328.3         317.9
CaaC          1593.1        1563.7

RMSE is the root mean-square Vasicek model prediction error measured in basis points using
Moody's Corporate default rate data, 1920-2008.


VIII. VASICEK-BASEL II AIRB PARAMETER ESTIMATION FROM SMALL SAMPLES
Few if any financial institutions have data on the default rate performance of their
internal rating systems over 89 years. Indeed, because of data limitations on banks' own
internal ratings system performance, the Basel II AIRB approach accepts as little as five
years of data as the minimum sample length that a bank may use for
estimation of some AIRB model parameters. An important issue of regulatory concern
has been the quality of the parameter estimates that may be generated from such small
samples. It has long been appreciated that short time series on ratings system
performance are unlikely to include data from a full credit cycle.

Figure 6: Predicted and Actual Default Rates for Moody's Aa-, A- and Baa-Rated
Corporate Credits, 1920-2008, 20 basis points truncation
[Figure: three time series panels (Moody's Aa Credits, Moody's A Credits, Moody's Baa Credits); horizontal axis: year, 1920-2008; vertical axis: default rate; series: actual, predicted.]

Figure 7: Predicted and Actual Default Rates for Moody's Ba-, B- and CaaC-
Rated Corporate Credits, 1920-2008, 20 basis points truncation
[Figure: three time series panels (Moody's Ba Credits, Moody's B Credits, Moody's CaaC Credits); horizontal axis: year, 1920-2008; vertical axis: default rate; series: actual, predicted.]

Small sample estimates of unconditional default rate inputs into AIRB regulatory
capital calculations may be downward biased if the samples do not include sufficient
data on a bust phase of the credit cycle. To correct for this potential shortcoming, the
Basel AIRB approach requires that data used to estimate unconditional default rates
either include data from a bust phase of the credit cycle or alternatively include some
other technique to adjust unconditional default rates so that they reflect downturn
conditions.
In this section, I present concrete evidence on the importance of the small sample
bias that may occur when the Vasicek-Basel II AIRB model parameters are estimated
from a short time series on model performance. I estimate the sampling distributions for
the parameter estimates of the Vasicek-Basel II AIRB model when the parameters are
estimated from sample sizes of 5 and 10 years of data. I compare these small sample
distributions to the sampling distribution for parameter estimates from the full sample of
89 years of data. I estimate the small sample distributions using a jackknife procedure
in which I draw 5000 replications using paired observations in order to preserve
heteroskedasticity characteristics in the data.
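The resampling design can be sketched as follows. This is a stylized stand-in, not the estimator used in the paper: it assumes a balanced panel, fits grade and year effects to the probit-transformed default rates by simple averaging (with the year effects constrained to average zero), and backs out the correlation parameter from the variance of the year effects under the assumption that the common factor has unit variance. Drawing years with replacement is also an assumption. The names `fit_vasicek` and `resample_pd` are hypothetical.

```python
import random
from statistics import NormalDist, fmean

N = NormalDist()

def fit_vasicek(panel, floor=0.0020):
    """Fit grade effects a_j and year effects b_t to probit default rates.

    `panel[t][j]` is the default rate of grade j in year t (balanced panel).
    Zero rates are replaced by `floor` (20 basis points by default).
    Returns (unconditional PDs by grade, correlation estimate, year effects).
    """
    y = [[N.inv_cdf(r if r > 0.0 else floor) for r in year] for year in panel]
    n_grades = len(y[0])
    grand_mean = fmean(v for year in y for v in year)
    a = [fmean(year[j] for year in y) for j in range(n_grades)]
    b = [fmean(year) - grand_mean for year in y]  # averages to zero by construction
    # Under the model, Var(b_t) = rho / (1 - rho), so rho = Var / (1 + Var).
    var_b = fmean(bt * bt for bt in b)
    rho = var_b / (1.0 + var_b)
    pds = [N.cdf(aj * (1.0 - rho) ** 0.5) for aj in a]
    return pds, rho, b

def resample_pd(panel, sample_years, replications, grade=0, seed=0):
    """Sampling distribution of one grade's PD estimate from short samples.

    Whole years are drawn (with replacement), so all grades' observations
    for a drawn year stay paired.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(replications):
        sample = rng.choices(panel, k=sample_years)
        pds, _, _ = fit_vasicek(sample)
        draws.append(pds[grade])
    return draws
```

Calling `resample_pd(panel, sample_years=5, replications=5000)` on a year-by-grade panel would give a small-sample distribution that can be compared with the full-sample estimate from `fit_vasicek(panel)`.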
Figure 8: Sampling Distributions for the Unconditional Probability of Default on
Moody's Aa-Rated Credits for Alternative Sample Sizes Drawn from 1920-2008
Data, Zero Default Rates Truncated to 20 bps
[Figure: three histogram panels, one per sample size; horizontal axis: Aaprob (estimated unconditional probability of default); vertical axis: percent.]

The sampling distributions for Vasicek model unconditional default rate
parameter estimates for Moody's Aa-rated credits derived from alternative sample sizes
are plotted in Figure 8. The plots in Figure 8 show a clear pattern in which small sample
parameter estimates are downward biased relative to the parameter estimates derived
from long time series. Small sample parameter estimate sampling distributions are
strongly concentrated at the low end of the range, with a high probability that a sample
estimate will indicate only a minimal unconditional probability of default when true
underlying default rates are far higher. The magnitude of the bias is greater for smaller
sample sizes. Unconditional default rate estimates generated from samples even as long
as 10 years are likely to be strongly downward biased.
Figure 9: Sampling Distributions for the Unconditional Probability of Default on
Moody's Baa-Rated Credits for Alternative Sample Sizes Drawn from 1920-2008
Data, Zero Default Rates Truncated to 20 bps
[Figure: three histogram panels, one per sample size; horizontal axis: Baaprob (estimated unconditional probability of default); vertical axis: percent.]

Figure 9 plots the sampling distribution for the unconditional default rate on
Moody's Baa-rated credits for alternative sample sizes. Figure 9 shows a pattern that is
very similar to the pattern evident in Figure 8. Figures 8 and 9 clearly demonstrate that there is
a very high probability that Vasicek model estimates based on small sample sizes will
significantly understate the unconditional probability of default associated with a
Moody's corporate credit grade. Figures 10 and 11 (in the Appendix) show that this
pattern also holds for Moody's A- and B-rated corporate credits.
The downward bias in small sample estimates is a consequence of at least two
important features of the Moody's corporate default rate data: (1) the prevalence of zero
reported default rates, and (2) the strong positive autocorrelation in the common macro
factor that drives default. The jackknife random sampling technique for drawing small
samples will potentially draw a large share of zero default rates given their weight in the
data. Since the macro factor is strongly autocorrelated, a very long time
series is required before the macro factor is likely to have a sample average of zero. Because model
identification is achieved by imposing the zero mean condition, macro factor estimates
are likely to be biased in small samples. This bias will induce a bias in the unconditional
default rate estimates as well.
The biases demonstrated in the small sample results in Figures 8-11
likely understate the estimation issues associated with the small sample estimates that are
likely to be derived in practice, because true small samples will have positively temporally
correlated observations, and so most of the macro factor draws in a small sample are
likely to be of the same sign. In such a case, the model restriction that the
macro factor average draw be zero over the sample will induce a larger bias in the macro
factor estimates compared to those produced under the jackknife random sampling
techniques that underpin Figures 8-11. This additional issue does not necessarily arise in
the small sample parameter distributions plotted in Figures 8-11 because each
observation in a jackknife sample is chosen at random from the entire time series, so the
underlying autocorrelation structure that is evident in the raw data is not necessarily
preserved.
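The effect of autocorrelation on the zero-mean identifying restriction can be quantified with a small worked calculation. The sketch below assumes, purely for illustration, that the macro factor follows a stationary AR(1) process with unit variance; neither the AR(1) form nor the coefficient of 0.8 is estimated in the paper. It uses the standard result that the variance of the mean of n consecutive observations is (1/n)[1 + 2 * sum over k of (1 - k/n) * phi^k].

```python
import math

def sd_of_window_mean(n, phi):
    """Standard deviation of the mean of n consecutive draws from a
    stationary, unit-variance AR(1) process with coefficient `phi`."""
    weighted_autocorr = sum((1.0 - k / n) * phi ** k for k in range(1, n))
    return math.sqrt((1.0 + 2.0 * weighted_autocorr) / n)

# Five consecutive years: independent draws versus a persistent factor.
sd_independent_5 = sd_of_window_mean(5, 0.0)
sd_persistent_5 = sd_of_window_mean(5, 0.8)
# The same persistent factor averaged over the full 89-year sample.
sd_persistent_89 = sd_of_window_mean(89, 0.8)
```

Under these illustrative assumptions, the average factor draw over five consecutive years is almost twice as dispersed around zero as it would be under independence, so forcing the estimated factor to average zero over such a window absorbs more of the true cycle into the unconditional default rate estimate than it would under the random sampling behind Figures 8-11.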

IX. A METHOD FOR CORRECTING THE BIAS IN SMALL SAMPLE ESTIMATES
This econometric implementation of the Vasicek model can be modified to derive
consistent estimates of the unconditional default rates associated with additional grades
or auxiliary rating systems that may not have a long time series of default rate realizations.
The consistency of the estimator is conditional on the assumption that the default rate and
correlation factor associated with the credit rating class are identical to those that
characterize the credit grading system for which there exists a long time series of default
rates, for example, the Moody's 1920-2008 default data sample.
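Recall that the Vasicek model implies,

$$\Phi^{-1}(\tilde{P}_{jt}) = \frac{\Phi^{-1}(PD^{A})}{\sqrt{1-\rho}} - \frac{\sqrt{\rho}}{\sqrt{1-\rho}}\, e_{Mt} + \tilde{\varepsilon}_{it} \tag{14}$$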

In the econometric implementation of specification (14), the year fixed-effect estimates
are estimates of the macro factor realizations, $b_t = -\frac{\sqrt{\rho}}{\sqrt{1-\rho}}\, e_{Mt}$. If we use estimates of
the macro factor realizations that are constructed from a long time series, these estimates
are consistent estimates of the underlying macro factor realizations (assuming the
Vasicek model is true). We can use these macro factor estimates to estimate the
unconditional default rates associated with a different rating system that may not have a
long sample of default rate realizations.
Let $PD^{A}$ represent the unconditional probability of default on an auxiliary rating
category for which we have data on default rates with a sample size of $S$, where $S$ is a
relatively modest sample size. From expression (14), it is evident that,

$$\Phi^{-1}(\tilde{P}_{jt}) - b_t = \frac{\Phi^{-1}(PD^{A})}{\sqrt{1-\rho}} + \tilde{\varepsilon}_{it}, \tag{15}$$

Consequently, as the size of the small sample on the alternative rating grade increases,

$$\frac{1}{S}\sum_{t=q}^{S}\left(\Phi^{-1}(\tilde{P}_{jt}) - b_t\right) \xrightarrow{\;a.s.\;} \frac{\Phi^{-1}(PD^{A})}{\sqrt{1-\rho}}. \tag{16}$$

From expression (16), it is possible to construct a consistent estimate of the
unconditional default rate of the new rating class that controls for the macro factor
conditions that drive defaults. The quantity
$\Phi\left(\sqrt{1-\rho}\,\frac{1}{S}\sum_{t=q}^{S}\left(\Phi^{-1}(\tilde{P}_{jt}) - b_t\right)\right)$
is an estimator for the unconditional default rate for the rating class,

$$\Phi\left(\sqrt{1-\rho}\,\frac{1}{S}\sum_{t=q}^{S}\left(\Phi^{-1}(\tilde{P}_{jt}) - b_t\right)\right) \xrightarrow{\;a.s.\;} PD^{A}, \tag{17}$$

where $PD^{A}$ is the unconditional probability of default on the auxiliary rating class.
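The adjustment can be sketched in a few lines, as it is read here: remove the long-sample year effect from each probit-transformed default rate, average over the short sample, rescale by the square root of one minus the correlation parameter, and map back through the normal distribution function. The function name `macro_adjusted_pd`, the 20 basis point floor default, and the illustrative inputs are assumptions, not values from the paper.

```python
from statistics import NormalDist, fmean

N = NormalDist()

def macro_adjusted_pd(default_rates, year_effects, rho, floor=0.0020):
    """Unconditional PD estimate that controls for the macro factor.

    `default_rates[t]` is the short-sample default rate in year t,
    `year_effects[t]` the year fixed effect b_t estimated from the long
    sample, and `rho` the long-sample correlation parameter estimate.
    Zero default rates are replaced by `floor`.
    """
    adjusted = [
        N.inv_cdf(p if p > 0.0 else floor) - b
        for p, b in zip(default_rates, year_effects)
    ]
    return N.cdf((1.0 - rho) ** 0.5 * fmean(adjusted))

# Illustration: three downturn years generated from a known PD of 2 percent.
true_pd, rho = 0.02, 0.2
factor_draws = [-1.5, -1.0, -0.5]  # negative draws raise default rates
effects = [-(rho / (1.0 - rho)) ** 0.5 * e for e in factor_draws]
rates = [N.cdf(N.inv_cdf(true_pd) / (1.0 - rho) ** 0.5 + b) for b in effects]

simple_average = fmean(rates)
adjusted_estimate = macro_adjusted_pd(rates, effects, rho)
```

In this constructed example the simple sample average overstates the unconditional default rate because every year in the sample is a downturn year, while the adjusted estimate recovers the value used to generate the data.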
Table 13: Macro Factor Estimates from Moody's Corporate Default Data 1920-2008

Year        macro factor estimate
2001        -0.19447
2002        -0.27158
2003        -0.04959
2004        0.09941
2005        0.15243
2006        0.15585
2007        0.24216
2008        0.17973

Estimates are based on the Vasicek model specification (Table 6), with zero default rates
truncated to 20 basis points.

I apply this macro factor adjustment algorithm to estimate the unconditional
default rates for Moody's alpha-numeric rating scale. While Moody's has published default
rate statistics on these grades since the early 1980s, I use data from 2001-2008 to
demonstrate the adjustment. I exclude rating grades which exhibit no defaults over this
sample period, and I truncate zero default rates to 20 basis points. The correlation and
macro factor adjustments used in (17) are taken from the Vasicek model estimates
derived from the Moody's 1920-2008 data on letter rating grades when zero default rates
are truncated to 20 basis points (Table 6). The macro factors are reported in Table 13.
Over this period, there are 4 years in which macro effects work to increase default rates
(2001-2003 and 2008) and 4 years in which the common factor reduced default rates
(2004-2007).
Table 14: Alternative Estimates of the Unconditional Default Rates Associated with
Moody's Alpha-Numeric Rating Grades, 2001-2008

Moody's            Sample unconditional      Model estimate of
alpha-numeric      probability of            unconditional
rating             default estimate          probability of default
Aa3                18.51                     32.84
A1                 12.36                     31.01
A2                 5.43                      27.81
A3                 5.56                      27.89
Baa1               22.56                     34.46
Baa2               25.05                     37.21
Baa3               30.53                     37.44
Ba1                33.51                     39.31
Ba2                40.25                     47.01
Ba3                117.98                    102.50
B1                 100.84                    80.22
B2                 252.71                    157.20
B3                 508.04                    364.75
Caa1               921.98                    758.48
Caa2*              1608.40                   1460.38
Caa3               2556.39                   2495.40

Model estimates use estimates of the macroeconomic factor realizations over the 2001-2008
period and the model correlation parameter estimate derived from the Moody's letter rating
grade model estimated over the sample period 1920-2008.


Table 14 reports the sample unconditional mean default rate by rating grade as
well as estimates of the rating grade unconditional default rates that adjust for the effect
of the macro factor using expression (17). The macro factor adjustment has a nonlinear effect
on the unconditional probability of default estimates. Because of the non-linearity of the
cumulative normal distribution function, the macro factor adjustments have a much larger
impact when the reported default rates are small. Overall, the results suggest that the
simple sample mean default rate underestimates the unconditional default rates on higher
quality credits (Aa3-Ba2) and overestimates the unconditional default rate on the lower
quality credits (Ba3-Caa3).

X. CONCLUSIONS
This paper has developed a new approach that uses standard panel
regression techniques to estimate the parameters of the asymptotic Vasicek portfolio
default rate model that is used as the basis for the Basel AIRB regulatory capital
framework. The approach produces consistent estimates of all the model parameters
using time series data on a cross section of the failure rates from a consistent credit
rating system. The approach is novel in that it produces consistent estimates of the
Vasicek/AIRB correlation parameter directly from the default rate data without any need
to use stock return or other data and methods of inference. The approach can be used to
estimate multiple correlation parameters for a rating system. Estimates suggest that, in
contrast to Basel II AIRB assumptions, lower grade (sub-investment grade) credits have a
substantially higher Vasicek correlation parameter compared to investment grade credits.
Because the new approach to parameter estimation is based on standard
econometric techniques, the approach introduces a new battery of test statistics and
model diagnostic tools into the Basel AIRB calibration discussion. For example, bootstrap
methods are used to calculate sampling distributions and exact small sample test statistics
for Basel AIRB model parameters. The results suggest that the unconditional default
probability inputs (and the correlation parameter) into the Basel AIRB framework can be
estimated, but the accuracy of the estimated values hinges on a number of important
considerations. The method selected to handle 0 default rate observations in the
estimation process is a particularly important issue that may not have received adequate
attention in the literature. Sample size is also a very important issue; it is well known,
but it could benefit from additional study. Model parameter estimates
derived from small samples are biased, and the analysis in this paper helps to quantify the
magnitudes of the potential bias.
This new econometric approach also provides consistent estimates of the
common macro factor that is assumed to drive credit defaults in the Vasicek-AIRB
model. Statistical analysis of the macro factor time series derived from the Moody's
data provides strong evidence of a credit cycle in corporate default rates. The time series
properties of this macro factor are inconsistent with the Vasicek-Basel II AIRB model, which
assumes that common factor realizations are independent. The macro factor estimates that
are generated can be used to correct small sample biases that arise when unconditional
default rates are estimated using only brief time series histories that may not include full
or balanced credit cycles. When this adjustment is applied to estimate the unconditional
default rates associated with selected grades from Moody's alpha-numeric rating scale
using data from 2001-2008, the results suggest that simple default rate sample averages
by credit grade understate the unconditional probabilities of default associated with high
quality grades and overestimate these probabilities for the more risky credit classes.

XI. APPENDIX
Table A1: Simulated Asymptotic Portfolio Sample Default Rate Correlations Under
Different Assumptions for the Vasicek Correlation Parameter
(1000 observations)

Correlation parameter = 0.01
        P1       P2       P3       P4
P1      1.000    1.000    0.999    0.998
P2               1.000    1.000    0.999
P3                        1.000    1.000
P4                                 1.000

Correlation parameter = 0.10
        P1       P2       P3       P4
P1      1.000    0.994    0.988    0.972
P2               1.000    0.999    0.992
P3                        1.000    0.996
P4                                 1.000

Correlation parameter = 0.20
        P1       P2       P3       P4
P1      1.000    0.991    0.984    0.961
P2               1.000    0.999    0.989
P3                        1.000    0.995
P4                                 1.000

Correlation parameter = 0.40
        P1       P2       P3       P4
P1      1.000    0.990    0.981    0.952
P2               1.000    0.999    0.986
P3                        1.000    0.993
P4                                 1.000


Figure 10: Sampling Distributions for the Unconditional Probability of Default on
Moody's A-Rated Credits for Alternative Sample Sizes Drawn from 1920-2008
Data, Zero Default Rates Truncated to 20 bps
[Figure: three histogram panels, one per sample size; horizontal axis: Aprob (estimated unconditional probability of default); vertical axis: percent.]

Figure 11: Sampling Distributions for the Unconditional Probability of Default on
Moody's B-Rated Credits for Alternative Sample Sizes Drawn from 1920-2008
Data, Zero Default Rates Truncated to 20 bps
[Figure: three histogram panels, one per sample size; horizontal axis: Bprob (estimated unconditional probability of default); vertical axis: percent.]

References
Cameron, A. C. and Trivedi, P. K. (2005). Microeconometrics: Methods and Applications. New
York: Cambridge University Press.

Chernick, Michael R. (1999). Bootstrap Methods: A Practitioner's Guide. Wiley Series in
Probability and Statistics.

Davison, A. C. and Hinkley, D. (1997). Bootstrap Methods and their Application.
Cambridge: Cambridge Series in Statistical and Probabilistic Mathematics.

Davison, A. C. and Hinkley, D. (2006). Bootstrap Methods and their Application (8th ed.).
Cambridge: Cambridge Series in Statistical and Probabilistic Mathematics.

Diaconis, P. and Efron, B. (May 1983). "Computer-intensive methods in statistics".
Scientific American: 116-130.

Efron, B. (1979). "Bootstrap Methods: Another Look at the Jackknife". The Annals of
Statistics 7 (1): 1-26.

Financial Stability Institute Occasional Paper No. 6 (September 2006). Implementation
of the new capital adequacy framework in non-Basel Committee member countries,
Bank for International Settlements, http://www.bis.org/fsi/fsipaper06.pdf

Gordy, Michael (2003). "A Risk-Factor Model Foundation for Ratings-based Bank
Capital Rules." Journal of Financial Intermediation, Vol. 12, pp. 199-232.

Kupiec, P. (2007). "Financial Stability and Basel II," Annals of Finance, Vol. 3, pp. 107-
130.

Kupiec, P. (2007). "Capital Allocation for Portfolio Credit Risk," The Journal of
Financial Services Research, Vol. 32, No. 1-2, pp. 103-122.

Kupiec, P. (2008). "Basel II: A Case for Recalibration," in Handbook of Financial
Intermediation and Banking, Anjan Thakor and Arnoud Boot, editors. New York: North
Holland.

Kupiec, P. (2008). "A Generalized Single Common Factor Model of Portfolio Credit
Risk," The Journal of Derivatives, Vol. 15, No. 3, pp. 25-40.

Lando, David and Torben Skødeberg (2002). "Analyzing rating transitions and rating
drift with continuous observations," Journal of Banking and Finance, Vol. 26, pp. 423-
444.

Moody's Investor Service (2007). Corporate Default and Recovery Rates, 1920-2006.
(February).

Moody's Investor Service (2008). Corporate Default and Recovery Rates, 1920-2007.
(February).

Schönbucher, P. (2001). "Factor Models: Portfolio Credit Risks When Defaults Are
Correlated." Journal of Risk Finance, Vol. 3, No. 1, pp. 45-56.

Vasicek, O. (1987). "Probability of Loss on a Loan Portfolio." KMV, Working Paper.
Published (2003) as "Loan Portfolio Value." Risk, December, pp. 160-162.

Wald, A. and Wolfowitz, J. (1940), "On a test whether two samples are from the same
population," Ann. Math Statist. 11, 147-162.

The Basel Committee on Banking Supervision (2004). International Convergence of
Capital Measurement and Capital Standards. Bank for International Settlements, June.
