
CAPM: Do you want fries with that?

Investment decision making ultimately comes down to questions of risk.

How should risk be assessed? How much risk should we take to obtain a given return? What types of risk are rewarded and what types are not? The capital asset pricing model (CAPM) is the standard model representing the relationship between risk and return. CAPM states that risk is measured by the variance in the returns, so that the expected return of an investment represents the reward, while the variance of returns is the risk. In this representation of reality, given two investments with the same expected return but different variances, an investor will always choose the investment with smaller variance. Similarly, given two investments with the same variance of returns but different expected returns, an investor will always choose the investment with higher expected return. Under the CAPM model, all variance is risk, but not all risk is rewarded. For any asset, risk comes from two sources: effects that come from the specific actions of the asset manager (which affect only that asset), and marketwide movements (which affect all assets). Since marketwide effects will affect all assets, they cannot be diversified away. On the other hand, asset-specific components of risk will cancel out with each other if a large portfolio of assets is constructed, so under CAPM they are not rewarded. That is, under CAPM only variability related to market variability (the systematic risk, or nondiversifiable risk) is rewarded. Under CAPM, the expected return on an asset R can be written as a function of the riskfree rate Rf (the return on the riskless asset, which has no variance; this is typically taken to be a short- or long-term bond rate, such as the 3-month Treasury bill rate) and the expected return of the market E(Rm):

    E(R) = Rf + β(E[Rm] - Rf) = Rf(1 - β) + β(E[Rm]),        (1)

where β is the beta of the asset, the covariance of the asset's returns with the market returns divided by the variance of the market returns. This function is called the security market line.
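As a small concrete illustration of these definitions, here is a minimal sketch in Python (not part of the original notes, which use Minitab) that computes a beta as the covariance of asset and market returns divided by the market variance, and then evaluates the security market line; the return series and rates are invented for illustration only.

```python
import numpy as np

# Hypothetical monthly returns, for illustration only.
asset = np.array([0.021, -0.013, 0.035, 0.008, -0.027, 0.019])
market = np.array([0.015, -0.010, 0.028, 0.005, -0.020, 0.012])

# Beta: covariance of the asset's returns with the market returns,
# divided by the variance of the market returns.
beta = np.cov(asset, market, ddof=1)[0, 1] / np.var(market, ddof=1)

# Security market line: E(R) = Rf + beta * (E[Rm] - Rf).
rf = 0.004          # assumed monthly riskless rate
exp_mkt = 0.008     # assumed expected monthly market return
exp_ret = rf + beta * (exp_mkt - rf)

print(f"beta = {beta:.3f}, CAPM expected return = {exp_ret:.4f}")
```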

The beta of a security is of interest to an investor, as it measures the relative risk of the security compared with the market (a beta greater than one indicates a riskier than average security, while a beta less than one is consistent with a safer than average security). The beta can be estimated using a regression model relating stock returns to market returns,

    Ri = β0 + β1 Rmi + εi,        (2)

with V(εi) = σ². Comparing this regression equation to (1) shows that the estimate of the slope is an estimate of beta. The estimated constant term can be compared to Rf(1 - β̂) to see how the stock performed relative to the prediction of performance using CAPM. (Technically, this is called the Sharpe-Lintner version of CAPM; the Black version replaces Rf(1 - β) in equation (1) with E(R0m)(1 - β), where R0m is the return on the so-called zero-beta portfolio, the portfolio that has the minimum variance of all portfolios uncorrelated with the market portfolio of assets.) The R² of the regression, which estimates the proportion of the variability in the security accounted for by the market, estimates the market (nondiversifiable) risk of the security.

The data examined here are the monthly returns for the McDonalds Food Corporation. The data cover November 1988 through March 1996, or 89 months. The market return is measured using the New York Stock Exchange Composite Index. Here are the values:

Row  Date  McDonalds return  Market return
  1  8811     -0.042501        -0.020172
  2  8812      0.021505         0.028091
  3  8901     -0.001347         0.037379
  4  8902      0.079096         0.041268
  5  8903     -0.009143        -0.017024
  6  8904      0.048028         0.024549
  7  8905      0.084656         0.052701
  8  8906      0.016789         0.020975
  9  8907      0.017058         0.037563
 10  8908     -0.016772         0.076022
 11  8909      0.011736         0.009559
 12  8910      0.018951        -0.048047
 13  8911      0.040373        -0.019689
 14  8912      0.076532         0.025833
 15  9001     -0.054485        -0.028048
 16  9002      0.017563        -0.030466
 17  9003     -0.025920         0.034622
 18  9004     -0.005864         0.007355
 19  9005      0.021756         0.019589
 20  9006      0.098020         0.015266
 21  9007      0.010572        -0.013860
 22  9008     -0.197538        -0.088701
 23  9009     -0.069716        -0.050428
 24  9010      0.002323        -0.024987
 25  9011      0.056075         0.012492
 26  9012      0.039858         0.038468
 27  9101     -0.045752         0.006911
 28  9102      0.103667         0.115468
 29  9103      0.096968         0.007186
 30  9104      0.027639         0.005604
 31  9105     -0.025118         0.007187
 32  9106     -0.023000         0.018338
 33  9107     -0.002770        -0.008401
 34  9108     -0.020769         0.013011
 35  9109      0.010605        -0.004602
 36  9110      0.071503         0.021460
 37  9111     -0.016911         0.003192
 38  9112      0.022588        -0.029701
 39  9201      0.194721         0.095603
 40  9202      0.019247         0.006729
 41  9203     -0.034140        -0.002026
 42  9204      0.006742        -0.000769
 43  9205      0.063522         0.023114
 44  9206      0.028110        -0.014667
 45  9207     -0.009537        -0.005648
 46  9208     -0.055916        -0.011874
 47  9209      0.038035        -0.001818
 48  9210     -0.030287        -0.016921
 49  9211      0.087144         0.029031
 50  9212      0.046727         0.022980
 51  9301      0.000653         0.010092
 52  9302      0.020408         0.032486
 53  9303      0.050000         0.021991
 54  9304     -0.065486         0.007931
 55  9305      0.001916         0.000844
 56  9306      0.008910        -0.002222
 57  9307     -0.011977         0.010511
 58  9308      0.100776         0.026065
 59  9309      0.000000         0.003056
 60  9310     -0.001149         0.004092
 61  9311      0.039424         0.012023
 62  9312      0.030143         0.014189
 63  9401     -0.002184         0.026788
 64  9402      0.051041         0.000882
 65  9403      0.003107        -0.028628
 66  9404     -0.057672        -0.052835
 67  9405      0.041512        -0.001732
 68  9406      0.006816         0.007713
 69  9407     -0.042189        -0.009808
 70  9408     -0.075072         0.009913
 71  9409      0.025863         0.003630
 72  9410      0.004587        -0.017590
 73  9411      0.065059        -0.011684
 74  9412     -0.020339        -0.017653
 75  9501      0.021881         0.039433
 76  9502      0.132760         0.024697
 77  9503      0.047243         0.014850
 78  9504      0.007220         0.037510
 79  9505      0.044817         0.026078
 80  9506      0.033427         0.030091
 81  9507      0.013278         0.053668
 82  9508     -0.025370        -0.004127
 83  9509      0.052920         0.029195
 84  9510      0.012769         0.001223
 85  9511      0.085914         0.023503
 86  9512      0.045700         0.040412
 87  9601      0.016655        -0.005367
 88  9602      0.112645         0.050522
 89  9603     -0.000628         0.005572

The use of monthly returns is quite typical in CAPM calculations, but the 7 1/2 year time period is a bit longer than is typical (for example, Value Line and Standard and Poor's use five years of data, while Bloomberg uses two). CAPM implies a linear relationship between McDonalds returns and market returns, which looks reasonable here:


[Scatter plot of McDonalds return (vertical axis) versus Market return (horizontal axis)]

There is one noteworthy month at the lower left, which is case 22 (August 1990). This was at the beginning of a recession, and while the market did poorly (a 9% drop), McDonalds did particularly poorly (a 20% drop). It's not too surprising that a company that specializes in fast food (hardly a staple item) would suffer in a recession, and McDonalds did; its long-term debt was $4.4 billion in 1990, its highest value ever up through early 1996. Here are the results of a regression fit.

Regression Analysis

The regression equation is
McDonalds return = 0.00735 + 1.09 Market return

Predictor      Coef       SE Coef      T       P
Constant     0.007351    0.004641     1.58   0.117
Market r     1.0893      0.1503       7.25   0.000

S = 0.04171   R-Sq = 37.7%   R-Sq(adj) = 36.9%

Analysis of Variance


Source        DF       SS          MS          F       P
Regression     1    0.091398    0.091398    52.55   0.000
Error         87    0.151328    0.001739
Total         88    0.242726

The estimate of beta is 1.089; while this is greater than one (indicating a riskier than average stock), it is not significantly greater than one, as the t-test statistic for the hypothesis H0: β1 = 1 is

    t = (1.0893 - 1) / .1503 = .59.
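A sketch of how a fit of this kind, and the test of H0: β1 = 1, might be reproduced in Python is given below; since the 89 monthly returns are not embedded here, the snippet generates synthetic stand-in series, so the numbers it prints will not match the output above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
# Stand-ins for the 89 monthly returns; replace with the actual series.
market = rng.normal(0.009, 0.03, 89)
mcd = 0.007 + 1.09 * market + rng.normal(0, 0.04, 89)

X = sm.add_constant(market)        # columns: intercept, market return
fit = sm.OLS(mcd, X).fit()

beta_hat, se_beta = fit.params[1], fit.bse[1]
# t-statistic for H0: beta = 1 (the standard output tests beta = 0 instead).
t_stat = (beta_hat - 1) / se_beta

print(fit.summary())
print(f"t for H0: beta = 1: {t_stat:.2f}")
```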

R² = .377, leaving 62.3% diversifiable risk. This value of market (nondiversifiable) risk is a bit higher than is typical for U.S. stocks, since market risk averages about 27.0% in the U.S. market (it averages about 35% for U.K. stocks, 45% for German stocks, and 60% for the Taiwanese stock market). Does the least squares model fit these data? Here are some regression diagnostics. Note that August 1990 is apparently an outlier / leverage / influential point; August 1989, February 1991, and January 1992 also show up as possibly problematic.

Data Display

Row  Date    SRES1        HI1        COOK1
  1  8811  -0.67611   0.022593   0.005283
  2  8812  -0.39749   0.015770   0.001266
  3  8901  -1.19775   0.021396   0.015683
  4  8902   0.65035   0.024418   0.005293
  5  8903   0.04970   0.020305   0.000026
  6  8904   0.33654   0.014214   0.000817
  7  8905   0.48577   0.035575   0.004352
  8  8906  -0.32366   0.012974   0.000688
  9  8907  -0.75655   0.021530   0.006297
 10  8908  -2.65715   0.068855   0.261047
 11  8909  -0.14534   0.011236   0.000120
 12  8910   1.57633   0.054091   0.071046
 13  8911   1.32082   0.022226   0.019828
 14  8912   0.99137   0.014740   0.007352
 15  9001  -0.76136   0.029448   0.008794
 16  9002   1.05761   0.031876   0.018414
 17  9003  -1.71889   0.019493   0.029369
 18  9004  -0.51187   0.011290   0.001496
 19  9005  -0.16732   0.012583   0.000178
 20  9006   1.78572   0.011682   0.018846
 21  9007   0.44332   0.018263   0.001828
 22  9008  -2.79304   0.136198   0.615005
 23  9009  -0.54669   0.057717   0.009153
 24  9010   0.53931   0.026592   0.003973
 25  9011   0.84681   0.011360   0.004120
 26  9012  -0.22787   0.022203   0.000590
 27  9101  -1.46207   0.011317   0.012234
 28  9102  -0.76968   0.157294   0.055288
 29  9103   1.97225   0.011300   0.022228
 30  9104   0.34204   0.011424   0.000676
 31  9105  -0.97171   0.011300   0.005396
 32  9106  -1.21419   0.012272   0.009159
 33  9107  -0.02342   0.015352   0.000004
 34  9108  -1.01992   0.011405   0.006000
 35  9109   0.19962   0.013783   0.000278
 36  9110   0.98415   0.013123   0.006440
 37  9111  -0.66903   0.011737   0.002658
 38  9112   1.15927   0.031091   0.021562
 39  9201   2.11255   0.107705   0.269347
 40  9202   0.11011   0.011329   0.000069
 41  9203  -0.94805   0.012932   0.005888
 42  9204   0.00553   0.012580   0.000000
 43  9205   0.74825   0.013676   0.003882
 44  9206   0.88922   0.018759   0.007558
 45  9207  -0.25925   0.014178   0.000483
 46  9208  -1.21729   0.017115   0.012901
 47  9209   0.78831   0.012871   0.004051
 48  9210  -0.46522   0.020234   0.002235
 49  9211   1.16446   0.016237   0.011190
 50  9212   0.34628   0.013629   0.000828
 51  9301  -0.42658   0.011242   0.001035
 52  9302  -0.54036   0.018153   0.002699
 53  9303   0.45122   0.013293   0.001371
 54  9304  -1.96467   0.011264   0.021987
 55  9305  -0.15330   0.012187   0.000145
 56  9306   0.09605   0.012991   0.000061
 57  9307  -0.74217   0.011252   0.003134
 58  9308   1.57097   0.014840   0.018588
 59  9309  -0.25758   0.011759   0.000395
 60  9310  -0.31250   0.011602   0.000573
 61  9311   0.45758   0.011325   0.001199
 62  9312   0.17691   0.011533   0.000183
 63  9401  -0.93543   0.015159   0.006735
 64  9402   1.03083   0.012179   0.006551
 65  9403   0.65592   0.030016   0.006657
 66  9404  -0.18482   0.061532   0.001120
 67  9405   0.86994   0.012846   0.004924
 68  9406  -0.21549   0.011273   0.000265
 69  9407  -0.93920   0.016029   0.007185
 70  9408  -2.24787   0.011239   0.028719
 71  9409   0.35112   0.011669   0.000728
 72  9410   0.39732   0.020697   0.001668
 73  9411   1.70344   0.017010   0.025107
 74  9412  -0.20497   0.020742   0.000445
 75  9501  -0.68952   0.022943   0.005582
 76  9502   2.37893   0.014272   0.040971
 77  9503   0.57198   0.011621   0.001923
 78  9504  -0.99361   0.021492   0.010842
 79  9505   0.21884   0.014845   0.000361
 80  9506  -0.16210   0.016792   0.000224
 81  9507  -1.28342   0.036674   0.031354
 82  9508  -0.68139   0.013613   0.003204
 83  9509   0.33280   0.016321   0.000919
 84  9510   0.09859   0.012105   0.000060
 85  9511   1.27870   0.013817   0.011454
 86  9512  -0.13765   0.023719   0.000230
 87  9601   0.36586   0.014069   0.000955
 88  9602   1.22558   0.033186   0.025779
 89  9603  -0.33879   0.011427   0.000663
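The columns above (standardized residuals, leverage values, and Cook's distances) are standard OLS diagnostics. Here is a sketch of computing them, again with synthetic stand-in data so the snippet runs on its own; the flagging thresholds are common rules of thumb rather than anything prescribed in these notes.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
market = rng.normal(0.009, 0.03, 89)        # stand-in market returns
mcd = 0.007 + 1.09 * market + rng.normal(0, 0.04, 89)

X = sm.add_constant(market)
fit = sm.OLS(mcd, X).fit()
infl = fit.get_influence()

sres = infl.resid_studentized_internal      # standardized residuals (SRES)
lever = infl.hat_matrix_diag                # leverage values (HI)
cooks = infl.cooks_distance[0]              # Cook's distances (COOK)

# Flag months that look like outliers or leverage points.
n, p = len(mcd), X.shape[1]
flags = np.where((np.abs(sres) > 2.5) | (lever > 2 * p / n))[0]
for i in flags:
    print(i + 1, round(sres[i], 2), round(lever[i], 3), round(cooks[i], 3))
```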

We could now try to address potential model violations relative to the OLS model. For example, August 1990 might be removed, and we would reanalyze without it. Rather than do that, however, I'd like to raise a different question: is August 1990 really unusual? It's further from the regression line than we would expect under OLS assumptions, but there is good reason to doubt one of those assumptions here: the assumption of constant variance of the errors. If August 1990 corresponds to an observation with inherently larger residual variance, then its observed McDonalds return might not be unusually low at all.

Why might we expect nonconstant variance here? It comes from a crucial CAPM assumption: that the beta is constant over the entire 7 1/2 year time period. This is unlikely to be true, as there is ample empirical evidence that betas change over time. If we fit a model with a constant beta to data consistent with a changing beta, this will show up as nonconstant variance of a specific type. Let's consider a simple example: say there are two possible beta values for a given month, β1 + c and β1 - c (obviously we could choose β1 and c to represent the two values this way). The true underlying regression relationships are

    Ri = β0 + (β1 + c)Rmi + εi   with probability .5,        (3a)

and

    Ri = β0 + (β1 - c)Rmi + εi   with probability .5.        (3b)

Under this model, we have

    E(Ri) = .5[β0 + (β1 + c)Rmi] + .5[β0 + (β1 - c)Rmi] = β0 + β1 Rmi;

that is, on average the asset returns satisfy the CAPM formula (2). However, what are the variances of the errors, E[Ri - E(Ri) | Rmi]²? For group (3a), we have

    V(εi) = E[Ri - E(Ri) | Rmi]²
          = E[β0 + (β1 + c)Rmi + εi - {β0 + β1 Rmi}]²
          = E[cRmi + εi]²
          = c²Rmi² + σ².

For group (3b), we have

    V(εi) = E[Ri - E(Ri) | Rmi]²
          = E[β0 + (β1 - c)Rmi + εi - {β0 + β1 Rmi}]²
          = E[-cRmi + εi]²
          = c²Rmi² + σ².

That is, if the true beta varies in this way, the variance of the errors is σ² + c²Rmi²; we have heteroscedasticity, with the observed variance being a quadratic function of the market return. We can look at a plot of the absolute residuals from the OLS fit versus the market return values to see if nonconstant variance of this form is indicated. Here is a plot, with a lowess curve superimposed. This curve is an example of what is called a nonparametric regression estimate. Basically, it puts a smooth curve through the data points to help suggest structure that might not otherwise show up very clearly (it does this by fitting straight lines locally, rather than one straight line globally). The quadratic form of the nonconstant variance is very obvious.

[Plot of the absolute residuals from the OLS fit versus Market return, with a lowess curve superimposed]
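The argument above can also be checked by simulation. The sketch below (entirely synthetic data, not the McDonalds series) draws a true beta of β1 + c or β1 - c at random each month, fits a single constant-beta OLS model, and then smooths the absolute residuals against the market return with lowess; the smoothed values rise as the market return moves away from zero, which is the quadratic pattern just derived.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(2)
n, beta0, beta1, c, sigma = 2000, 0.007, 1.0, 0.6, 0.03

market = rng.normal(0.0, 0.05, n)
# Each month the true beta is beta1 + c or beta1 - c with probability .5.
beta = beta1 + c * rng.choice([-1.0, 1.0], size=n)
asset = beta0 + beta * market + rng.normal(0, sigma, n)

fit = sm.OLS(asset, sm.add_constant(market)).fit()   # constant-beta fit
abs_res = np.abs(fit.resid)

smooth = lowess(abs_res, market, frac=0.4)           # columns: x, smoothed |residual|
for x in (-0.10, -0.05, 0.0, 0.05, 0.10):
    i = np.argmin(np.abs(smooth[:, 0] - x))
    print(f"market {x:+.2f}: smoothed |residual| about {smooth[i, 1]:.4f}")
```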

A Levene's test clearly rejects constant variance in favor of a quadratic model for heteroscedasticity (see the appendix for discussion of how to identify and handle nonconstant variance that is related to a numerical predictor, rather than to group membership):



Regression Analysis

The regression equation is
Absolute residuals = 0.691 - 1.35 Market return + 119 Markretsquared

Predictor      Coef       SE Coef      T       P
Constant     0.69102     0.07033      9.83   0.000
Market r    -1.353       2.322       -0.58   0.562
Markrets   118.77       34.53         3.44   0.001

S = 0.5928   R-Sq = 12.7%   R-Sq(adj) = 10.7%

Analysis of Variance

Source        DF       SS         MS        F       P
Regression     2     4.3986     2.1993    6.26   0.003
Error         86    30.2192     0.3514
Total         88    34.6178
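This check is just an ordinary regression of the absolute residuals on the market return and its square. A sketch of the idea is below, once more with synthetic stand-in data, so its coefficients will not match the output above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
market = rng.normal(0.009, 0.03, 89)                  # stand-in data
mcd = 0.007 + 1.09 * market + rng.normal(0, 0.04, 89)

capm_fit = sm.OLS(mcd, sm.add_constant(market)).fit()
abs_res = np.abs(capm_fit.resid)                      # absolute OLS residuals

# Regress the absolute residuals on the market return and its square;
# a significant squared term points to quadratic heteroscedasticity.
Z = sm.add_constant(np.column_stack([market, market ** 2]))
levene_fit = sm.OLS(abs_res, Z).fit()
print(levene_fit.summary())
```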

Here is a regression to estimate the weights for a WLS fit:

Regression Analysis

The regression equation is
lgsressq = - 1.51 + 1.28 Market return + 262 Markretsquared

Predictor      Coef       SE Coef      T       P
Constant    -1.5086      0.2430      -6.21   0.000
Market r     1.278       8.023        0.16   0.874
Markrets   261.8       119.3          2.19   0.031

S = 2.048   R-Sq = 6.6%   R-Sq(adj) = 4.4%

Analysis of Variance

Source        DF       SS         MS        F       P
Regression     2     25.337     12.669    3.02   0.054
Error         86    360.871      4.196
Total         88    386.208

The following plot illustrates the quadratic fit being used to estimate these weights:

[Regression plot of lgsressq versus Market r, with the fitted quadratic Y = -1.50862 + 1.27803X + 261.782X**2 superimposed; R-Sq = 0.066]

Here is a WLS version of the CAPM fit:

Regression Analysis

Weighted analysis using weights in wt

The regression equation is
McDonalds return = 0.00956 + 0.961 Market return

Predictor      Coef       SE Coef      T       P
Constant     0.009556    0.004241     2.25   0.027
Market r     0.9610      0.1904       5.05   0.000

S = 0.07457   R-Sq = 22.6%   R-Sq(adj) = 21.8%

Analysis of Variance

Source        DF       SS         MS         F       P
Regression     1     0.14168    0.14168   25.47   0.000
Error         87     0.48401    0.00556
Total         88     0.62569
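A sketch of the whole two-step procedure used here: regress the logged squared standardized residuals on the market return and its square, turn the fitted values into weights, and refit the CAPM regression by weighted least squares. The data are synthetic stand-ins (with a deliberately changing beta), so the printed estimates illustrate the mechanics rather than reproduce the output above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
market = rng.normal(0.009, 0.04, 89)                     # stand-in market returns
true_beta = 1.0 + 0.5 * rng.choice([-1.0, 1.0], 89)      # beta changes month to month
mcd = 0.007 + true_beta * market + rng.normal(0, 0.03, 89)

X = sm.add_constant(market)
ols_fit = sm.OLS(mcd, X).fit()

# Log squared standardized residuals, regressed on the variance predictors.
sres = ols_fit.get_influence().resid_studentized_internal
lgsressq = np.log(sres ** 2)
Z = sm.add_constant(np.column_stack([market, market ** 2]))
var_fit = sm.OLS(lgsressq, Z).fit()

# Weights are the inverse of the estimated error variances.
wt = 1.0 / np.exp(var_fit.fittedvalues)

# Weighted least squares fit of the CAPM regression.
wls_fit = sm.WLS(mcd, X, weights=wt).fit()
print("OLS beta:", round(ols_fit.params[1], 3),
      " WLS beta:", round(wls_fit.params[1], 3))
```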



Things have changed a bit. The estimated beta for McDonalds is now less than one (although again, not significantly different from one). Note also that if this regression model were used to predict the McDonalds return from a given market return, the use of weights could change things dramatically. One would expect to find that a prediction interval from the WLS model would be narrower than one from the OLS model for a prediction for a small (close to zero) market return month, and wider for a prediction for a large (in absolute value) market return month, reflecting the inherent difference in variability off the regression line in these circumstances. August 1990 is no longer an outlier, since its high variability is accounted for by a small weight (that is, the assessment of the point as an outlier has changed because our model for the underlying variability of the observation has changed). Similarly, points previously flagged as potential leverage points are no longer assessed as problematic.

Row  Date    SRES2        HI2         COOK2
  1  8811  -0.90899   0.0313846   0.0133862
  2  8812  -0.38421   0.0216368   0.0016323
  3  8901  -1.10109   0.0278343   0.0173564
  4  8902   0.67416   0.0301746   0.0070705
  5  8903  -0.06580   0.0278675   0.0000621
  6  8904   0.38974   0.0193488   0.0014985
  7  8905   0.47698   0.0348600   0.0041088
  8  8906  -0.34625   0.0172601   0.0010528
  9  8907  -0.67121   0.0279507   0.0064772
 10  8908  -1.28713   0.0313948   0.0268489
 11  8909  -0.19751   0.0133179   0.0002633
 12  8910   1.24446   0.0582867   0.0479275
 13  8911   1.38654   0.0308341   0.0305820
 14  8912   1.09424   0.0201590   0.0123170
 15  9001  -0.99147   0.0405856   0.0207919
 16  9002   0.98144   0.0433563   0.0218270
 17  9003  -1.66055   0.0260398   0.0368614
 18  9004  -0.63790   0.0132034   0.0027223
 19  9005  -0.17892   0.0165339   0.0002691
 20  9006   2.03582   0.0146603   0.0308323
 21  9007   0.40715   0.0245613   0.0020871
 22  9008  -1.33968   0.0398550   0.0372495
 23  9009  -0.67044   0.0592510   0.0141552
 24  9010   0.45649   0.0370008   0.0040032
 25  9011   0.96311   0.0138295   0.0065039
 26  9012  -0.15504   0.0285165   0.0003528
 27  9101  -1.75915   0.0132096   0.0207128
 28  9102  -0.07833   0.0096077   0.0000298
 29  9103   2.28448   0.0132046   0.0349175
 30  9104   0.36163   0.0132861   0.0008805
 31  9105  -1.17992   0.0132046   0.0093147
 32  9106  -1.36401   0.0159271   0.0150562
 33  9107  -0.12198   0.0196353   0.0001490
 34  9108  -1.19281   0.0139605   0.0100721
 35  9109   0.15737   0.0169432   0.0002134
 36  9110   1.10382   0.0175263   0.0108677
 37  9111  -0.84500   0.0136596   0.0049442
 38  9112   1.09992   0.0424880   0.0268421
 39  9201   0.76404   0.0201325   0.0059970
 40  9202   0.09159   0.0132150   0.0000562
 41  9203  -1.20053   0.0155118   0.0113544
 42  9204  -0.05963   0.0149358   0.0000270
 43  9205   0.83955   0.0184774   0.0066344
 44  9206   0.92534   0.0253776   0.0111478
 45  9207  -0.39286   0.0176171   0.0013839
 46  9208  -1.54221   0.0226415   0.0275492
 47  9209   0.86912   0.0154108   0.0059115
 48  9210  -0.66391   0.0277551   0.0062915
 49  9211   1.25946   0.0222654   0.0180612
 50  9212   0.39922   0.0183982   0.0014936
 51  9301  -0.52347   0.0133812   0.0018582
 52  9302  -0.50161   0.0246045   0.0031735
 53  9303   0.51414   0.0178250   0.0023986
 54  9304  -2.34118   0.0132102   0.0366878
 55  9305  -0.24256   0.0143164   0.0004273
 56  9306   0.04283   0.0156088   0.0000145
 57  9307  -0.88904   0.0134404   0.0053840
 58  9308   1.71500   0.0203083   0.0304847
 59  9309  -0.35750   0.0136898   0.0008870
 60  9310  -0.41817   0.0134847   0.0011951
 61  9311   0.51192   0.0137214   0.0018229
 62  9312   0.19265   0.0143000   0.0002692
 63  9401  -0.96648   0.0207773   0.0099098
 64  9402   1.16619   0.0143037   0.0098676
 65  9403   0.56110   0.0412568   0.0067741
 66  9404  -0.34721   0.0599191   0.0038420
 67  9405   0.96668   0.0153696   0.0072933
 68  9406  -0.28770   0.0132056   0.0005538
 69  9407  -1.21161   0.0207955   0.0155879
 70  9408  -2.65115   0.0133585   0.0475814
 71  9409   0.36648   0.0135692   0.0009238
 72  9410   0.33529   0.0284856   0.0016481
 73  9411   1.90440   0.0224654   0.0416743
 74  9412  -0.36314   0.0285548   0.0019381
 75  9501  -0.58862   0.0291046   0.0051932
 76  9502   2.60271   0.0194413   0.0671540
 77  9503   0.64719   0.0145156   0.0030847
 78  9504  -0.90140   0.0279175   0.0116676
 79  9505   0.26432   0.0203163   0.0007244
 80  9506  -0.12688   0.0229809   0.0001893
 81  9507  -0.92057   0.0350679   0.0153991
 82  9508  -0.89047   0.0166545   0.0067148
 83  9509   0.38746   0.0223760   0.0017181
 84  9510   0.05847   0.0141909   0.0000246
 85  9511   1.41815   0.0187098   0.0191727
 86  9512  -0.06134   0.0296836   0.0000575
 87  9601   0.35241   0.0174310   0.0011016
 88  9602   1.09682   0.0342752   0.0213484
 89  9603  -0.44263   0.0132891   0.0013193

The Levene's test is no longer significant, which is consistent with the residual plots, which all look fine:

The regression equation is
absres = 0.816 - 0.61 Market return - 8.3 Marketsquared

Predictor      Coef       SE Coef      T       P
Constant     0.81588     0.07184     11.36   0.000
Market r    -0.607       2.371       -0.26   0.799
Marketsq    -8.31       35.27        -0.24   0.814

S = 0.6055   R-Sq = 0.2%   R-Sq(adj) = 0.0%

Analysis of Variance

Source            DF       SS         MS        F       P
Regression         2     0.0729     0.0365    0.10   0.905
Residual Error    86    31.5284     0.3666
Total             88    31.6013



[Plot of the standardized residuals (SRES2) from the WLS fit versus Market return]

[Plot of the absolute residuals (absres) from the WLS fit versus Market return]



[Plot of the standardized residuals versus the order of the data (response is McDonald)]

[Normal probability plot of the standardized residuals (response is McDonald)]

A new estimate of the market risk is based on squaring the correlation between the fits from this model and the observed McDonalds returns:

Correlations (Pearson)

Correlation of McDonalds return and FITS2 = 0.614

That is, R²w1 = .614² = 37.7% (the F-based R² measure is only R²w2 = 22.6%, reflecting that much of the apparent market risk is driven by months with high volatility). The riskless rate, as measured by the monthly equivalent rate for the first three-month Treasury bill auction for that month, averaged .0045 over this time period; comparing the observed constant term to Rf(1 - β̂1) gives us an estimate of how McDonalds performed compared to what CAPM would have predicted for it. Here this equals

    .009556 - (.0045)(1 - .9610) = .009381.

That is, McDonalds outperformed its CAPM prediction by 0.9381% per month, which converts to an 11.86% annual outperformance of its CAPM prediction [(1.009381)^12 = 1.1186]. The value of beta reported by investment analysts is usually rounded off to the nearest .05. It is also usually shrunk towards one because of regression to the mean (that is, analysts believe that stocks with unusually high or low betas in the past will probably be less extreme in the future). So, given our WLS estimate of 0.961, we would probably report McDonalds beta as 1.00. In fact, at this time the Value Line Investment Survey reported a beta of 1.00 for McDonalds, so we're right in line with established opinion.

One flaw in the previous analysis is that it is difficult to assess whether the observed unexpected performance (relative to CAPM) could just be due to random fluctuations; that is, is the 11.86% annual outperformance significantly different from zero? Also, the comparison of β̂0 to Rf(1 - β̂1) assumes that the riskless rate is constant over the entire time period, which is not reasonable. We can correct these problems if we use a slightly different regression model to fit CAPM, one based on excess returns. Let's go back to the original formulation of the CAPM model, but represent it a little differently:

    E(R) = Rf + β(E[Rm] - Rf),

or equivalently,

    E(R) - Rf = β(E[Rm] - Rf).        (4)

The values E(R) - Rf and E[Rm] - Rf are the expected excess returns of the asset and the market, respectively, over the riskless rate; that is, they represent the returns that can be expected to be gained beyond those that come with zero risk. A regression model based on (4),

    Ri - Rfi = β0 + β1(Rmi - Rfi) + εi,

where the target and predictor values are now excess returns, provides an alternative way to estimate beta (via the slope in the model). Further, by (4), CAPM implies that β0 = 0 (the expected excess return exactly equals beta times the market excess return), so β̂0 is an estimate of McDonalds performance relative to its predicted CAPM performance (sometimes called α). A test of whether the observed performance is significantly above or below the expected performance is then just the usual t-test for the constant term equaling zero. Here is an OLS regression using excess returns:

Regression Analysis

The regression equation is
McDonalds excess rate = 0.00773 + 1.09 Market excess rate

Predictor      Coef       SE Coef      T       P
Constant     0.007735    0.004481     1.73   0.088
Market e     1.0931      0.1504       7.27   0.000

S = 0.04170   R-Sq = 37.8%   R-Sq(adj) = 37.1%

Analysis of Variance

Source        DF       SS          MS          F       P
Regression     1    0.091861    0.091861    52.83   0.000
Error         87    0.151277    0.001739
Total         88    0.243138
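A sketch of the excess-return version of the fit is below; the Treasury-bill series is a stand-in (a constant .0045 per month, whereas the actual analysis uses the rate observed in each month), the return series are synthetic, and the annualization follows the (1 + α)^12 conversion used in these notes.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
market = rng.normal(0.009, 0.03, 89)                  # stand-in returns
mcd = 0.008 + 1.09 * market + rng.normal(0, 0.04, 89)
rf = np.full(89, 0.0045)                              # stand-in monthly riskless rates

mcd_excess = mcd - rf
mkt_excess = market - rf

fit = sm.OLS(mcd_excess, sm.add_constant(mkt_excess)).fit()
alpha, p_alpha = fit.params[0], fit.pvalues[0]        # alpha and its t-test p-value
beta = fit.params[1]

annual = (1 + alpha) ** 12 - 1                        # monthly alpha on an annual scale
print(f"beta = {beta:.3f}, alpha = {alpha:.5f} "
      f"({annual:.2%} per year), p = {p_alpha:.3f}")
```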



The estimate of beta (1.093) is similar to that from the earlier OLS fit (1.089). The estimated outperformance of McDonalds from its CAPM prediction is .007735 (9.69% annualized), and it is not significantly different from zero at a .05 level (p = .088). Residual plots and a Levene's test (not given here) again indicate heteroscedasticity in the square of the market return, with the following estimated weights:

Regression Analysis

The regression equation is
lgsressq = - 1.51 + 3.75 Market excess rate + 266 Markexsq

Predictor      Coef       SE Coef      T       P
Constant    -1.5100      0.2446      -6.17   0.000
Market e     3.752       7.761        0.48   0.630
Markexsq   266.3       120.0          2.22   0.029

S = 2.088   R-Sq = 6.5%   R-Sq(adj) = 4.4%

Analysis of Variance

Source        DF       SS         MS        F       P
Regression     2     26.280     13.140    3.01   0.054
Error         86    375.025      4.361
Total         88    401.305

Here is the WLS fit:

Regression Analysis

Weighted analysis using weights in wt

The regression equation is
McDonalds excess rate = 0.00936 + 0.945 Market excess rate

Predictor      Coef       SE Coef      T       P
Constant     0.009357    0.004084     2.29   0.024
Market e     0.9454      0.1913       4.94   0.000

S = 0.07487   R-Sq = 21.9%   R-Sq(adj) = 21.0%

Analysis of Variance

Source        DF       SS         MS         F       P
Regression     1     0.13692    0.13692   24.43   0.000
Error         87     0.48765    0.00561
Total         88     0.62457

The estimated beta (.945) is similar to the earlier WLS estimated beta (.961). The estimated outperformance of McDonalds compared to CAPM is .009357 (11.82% annualized), very similar to the earlier WLS estimate of .009381. Note that from this model fit, however, we can establish that this outperformance is apparently significantly different from zero (p = .024), something that the other model fits could not do. That is, CAPM fails for McDonalds, in the sense that McDonalds performance is significantly better than CAPM predicts.

An interesting application of WLS in the CAPM context can be found in the paper "Outlier-Resistant Estimates of Beta" by R.D. Martin and T.T. Simin (Financial Analysts Journal, 59(5), 56-69 [2003]). In that paper the authors use WLS to construct an estimator of beta that is resistant to the long-tailed nature of stock returns by downweighting outlying observations in the regression.

Appendix: WLS when the error variance is related to numerical predictors

We have previously discussed how nonconstant variance related to group membership can be identified using Levene's test, and handled using weighted least squares with the weights for the members of each group being the inverse of the residual variance for that group. Another way to refer to nonconstant variance related to group membership is to say that nonconstant variance is related to the values of a predictor variable, where that predictor variable happens to be categorical. It is also possible (as was the case here) that the variance of the errors is related to a (potential) predictor variable that is numerical (in this case it was effectively related to two variables, Market return and Market return²). Generalizing the Levene's test for this situation is straightforward; just construct a regression with the absolute residuals as the response and the potential numerical variable as a predictor. Note that this also can be combined with the situation with natural subgroups by running an ANCOVA model with the absolute residuals as the response and both the grouping variable(s) and the numerical variable(s) as predictors. It is important to remember that the response variable itself should never be used as a potential predictor for nonconstant variance, since the (potential) nonconstant variance is already reflected in that response.

Constructing weights for WLS in this situation is more complicated. What is needed is a model for what the relationship between the variances and the numerical predictor actually looks like. An exponential/linear model for this relationship is often used, whose parameters can be estimated from the data (this model has the advantage that it can only produce positive values for the variances, which of course is consistent with the actual situation). The model for the variance of the ith error is

    var(εi) = σi² = σ² exp(Σj γj zij),

where zij is the value of the jth variance predictor for the ith case and σ² is an overall average variance of the errors. These z variables would presumably be the predictors that were used above for the Levene's test, and while they would typically be chosen from the same pool of potential predictors as those for the regression itself (what we typically call the x's), they don't have to be the same variables (σ² could be related to a variable that isn't related to E(y), and it could be unrelated to a variable that is).

The problem with this formulation is that the γj coefficients are unknown, and need to be estimated from the data. The key is to recognize that since σi² = E(εi²), by the model given above

    log E(εi²) = log σ² + Σj γj zij ≡ γ0 + Σj γj zij.

That is, the logged expected squared errors follow a linear relationship with the z variables. This suggests that linear regression could be used to estimate the γ parameters, except that the expected squared errors are (of course) unknown. The trick is then to say that since the residuals are the best guesses we have for the errors, the squared residuals should be reasonable guesses for the expected squared errors, which means that the logged squared residuals can be used as a response in a regression to estimate the γ's. The steps are thus as follows:

(1) Create a variable that is the natural logarithm of the squares of the standardized residuals (LGSRESSQ, say). This variable can be formed in Minitab using the transformation Let LGSRESSQ = LN(SRES*SRES).

(2) Perform a regression of LGSRESSQ on the variance predictor variables (the z variables), and record the fitted regression coefficients (don't worry about measures of fit for this regression).

(3) Create a weight variable for use in the weighted least squares analysis. The weights are estimates of the inverse of the variance of the errors for each observation. They have the form WT = 1/exp(FITS1), where FITS1 is the variable with fitted values from the regression in step 2.

(4) Perform a weighted least squares regression, specifying WT as the weighting variable. You should redo a Levene's test to make sure that the nonconstant variance has been corrected.

Remember that all plots and tests must be based on the standardized residuals, not the ordinary residuals, since the attempts to address nonconstant variance are accounted for in the standardized residuals.

Just as was the case when doing WLS based on a categorical predictor, the estimated standard deviation of the error for any member of the population is s/√WTi. The value of WTi comes from the estimated regression function in step 2 above (which is why it is a good idea to write down that function). So, for example, for the CAPM data the function that defined the weights was

    WT = 1/exp(-1.51 + 1.278 Market return + 261.8 Market return²).

If a prediction for a new trading day for the McDonalds return was desired, and the market return on that day was .05 (for example), the weight associated with that day would be

    1/exp(-1.51 + (1.278)(.05) + (261.8)(.05²)) = 2.207.

The estimated McDonalds return on that day, found by substituting .05 for the market return into the WLS model, would be .05761, while the estimated standard deviation of the error term for that day would be s/√WT = .07457/√2.207 = .0502, where the s value also comes from the WLS model. Note that this estimated standard deviation of the errors is larger than that from the OLS model (which was .04171), which reflects that a day with a market return of .05 will have higher than average variability. A rough prediction interval for the McDonalds return on that day is thus .05761 ± (2)(.0502), or (-.0428, .158).

The exact prediction interval that comes out of Minitab requires more work. Here is the output that comes out if confidence and prediction intervals are requested for a value of market return equal to .05:

* WARNING * The prediction interval output assumes a weight of 1. An adjustment
            must be made if a weight other than 1 is used.

Predicted Values for New Observations

New Obs      Fit     SE Fit         95% CI                  95% PI
      1   0.05761   0.00928   (0.03917, 0.07605)   (-0.09176, 0.20698)

Values of Predictors for New Observations

New Obs   Market return
      1          0.0500
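The "by hand" pieces just described (the weight at a market return of .05, the WLS fitted value there, the error standard deviation, and the rough ±2 standard deviation interval) can be reproduced directly; all of the coefficients in this sketch are simply copied from the earlier output and treated as given.

```python
import numpy as np

# Weight function from the regression of the logged squared residuals.
def weight(mkt_return):
    return 1.0 / np.exp(-1.51 + 1.278 * mkt_return + 261.8 * mkt_return ** 2)

mr = 0.05
wt = weight(mr)                           # about 2.207

fit_value = 0.009556 + 0.9610 * mr        # WLS regression prediction
s_wls = 0.07457                           # s from the WLS fit
sd_error = s_wls / np.sqrt(wt)            # about .0502

# Rough 95% prediction interval: fitted value plus or minus 2 standard deviations.
low, high = fit_value - 2 * sd_error, fit_value + 2 * sd_error
print(round(wt, 3), round(fit_value, 5), round(sd_error, 4),
      (round(low, 4), round(high, 4)))
```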

Note that Minitab provides a warning that the prediction interval is incorrect. The problem is that the program assumes that the appropriate weight is equal to 1, even though we just saw that it really should be 2.207. The correction for this must be made by hand. The standard error of the fitted value (used for confidence intervals) is given correctly, but we need to calculate the standard error of the predicted value. This equals

    √[(Standard error of fitted value)² + (Residual MS)/(Weight)],

where the Residual MS comes from the WLS fit. Then, the prediction interval is

    Predicted value ± t(n-p-1; α/2) × (Standard error of predicted value).
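A sketch of this hand correction, using the Minitab quantities (SE Fit, the WLS residual mean square, and the weight) as given inputs; the degrees of freedom follow the n - p - 1 = 87 used in these notes.

```python
import numpy as np
from scipy import stats

fit_value = 0.05761     # fitted value at a market return of .05 (from the output)
se_fit = 0.00928        # SE Fit from the output
mse_wls = 0.00556       # residual MS from the WLS fit
wt = 2.207              # weight for a market return of .05

# Standard error of prediction: combine the SE of the fitted value with
# the weight-adjusted error variance for this observation.
se_pred = np.sqrt(se_fit ** 2 + mse_wls / wt)

t_crit = stats.t.ppf(0.975, df=89 - 2)    # n - p - 1 = 87 degrees of freedom
low, high = fit_value - t_crit * se_pred, fit_value + t_crit * se_pred
print(round(se_pred, 3), round(t_crit, 3), (round(low, 4), round(high, 4)))
```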



The standard error of the predicted value in this case is

    √(.00928² + .00556/2.207) = .051.

For n = 89 and p = 1 the appropriate critical value for a 95% interval is 1.988, giving prediction interval

    .05761 ± (1.988)(.051) = (-.0438, .159),

which is of course very similar to the rough prediction interval given earlier.

There is another mechanism by which variances of errors in a regression model can be different for different observations, and related to a numerical variable. Say that the response variable at the level of an individual follows the usual regression model,

    yi = β0 + β1 x1i + ··· + βp xpi + εi,

with εi ~ N(0, σ²). Imagine, however, that the ith observed response is actually an average ȳi for a sample of size ni with the observed predictor values {x1i, ..., xpi}. The model is thus

    ȳi = β0 + β1 x1i + ··· + βp xpi + ε̄i,

where

    V(ε̄i) = V(ȳi | {x1i, ..., xpi}) = σ²/ni.

An example of this kind of situation could be as follows. Say you were interested in modeling the relationship between student test scores and (among other things) income. While it might be possible to obtain test scores at the level of individual students, it would be impossible to get incomes at that level because of privacy issues. On the other hand, average incomes at the level of census tract or school district might be available, and could be used to predict average test scores at that same level. This is just a standard heteroscedasticity model, and WLS is used to fit it. In fact, this is a particularly simple case, since the weights do not need to be estimated at all; since V(ε̄i) = σ²/ni, the weight for the ith observation is just ni. That is, quite naturally, observations based on larger samples are weighted more heavily in estimating the regression coefficients. It should be noted, however, that this sort of aggregation is not without problems. Inferences made from aggregated data about individuals are called ecological inferences, and they can be very misleading. As is always the case, we must be aware of confounding effects of missing predictors; for example, if school districts with wealthier residents also have lower proportions of non-native English speakers, a positive slope for income could be reflecting an English speaker effect, rather than an income effect. In addition, ecological inferences potentially suffer from aggregation bias, whereby the information lost when aggregating (as it is clear that some information will be lost) is different for some individuals than for others (for example, if there is more variability in incomes in some school districts compared to others, more information is lost in those school districts), resulting in biased inferences.

Minitab commands

To construct a scatter plot with a lowess curve superimposed on it, enter the appropriate variables under Y variables and X variables as usual. Click on Data View, then Smoother, and click the button next to Lowess. Alternatively, a lowess curve can be superimposed on an existing plot by right clicking on the plot, and clicking Add Smoother, and then OK.
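Returning to the aggregated-data situation described just before the Minitab instructions: when each response is a group average based on ni individuals, the WLS weights are simply the group sizes and nothing needs to be estimated. Here is a synthetic sketch; the variable names and numbers are invented for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n_groups = 60
n_i = rng.integers(20, 400, n_groups)          # group (e.g. school district) sizes
income = rng.normal(50, 12, n_groups)          # average income in each group

# Average test score per group; the error variance is sigma^2 / n_i.
score_bar = 40 + 0.5 * income + rng.normal(0, 8 / np.sqrt(n_i))

X = sm.add_constant(income)
ols_fit = sm.OLS(score_bar, X).fit()
wls_fit = sm.WLS(score_bar, X, weights=n_i).fit()   # weights are just the n_i

print("OLS slope:", round(ols_fit.params[1], 3),
      "  WLS slope:", round(wls_fit.params[1], 3))
```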


