1 Introduction
1.1 Some Questions in Macroeconomics
1.2 The Role of General Equilibrium Theory
1.3 Microfoundations
1.4 Expectations
1.5 Why Adopt a Growth Theory Approach?
3 Growth Theory
3.1 Introduction
3.2 Production Function
3.2.1 Relative Wages in US Data
5 Life-Cycle Model
5.1 Benchmark Model
5.2 How the Benchmark Model Works
5.3 Analyzing a One-Time Shock
5.4 Can Model Allocations Be Improved?
5.5 Review of Marginal Conditions
5.6 Life-Cycle Model: One Downside to the Simple Formulation
5.7 Key Concepts
6 Business-Cycle Fluctuations
6.1 Business-Cycle Facts
6.2 Outlines of an Unsuccessful Theory
6.3 Technology Shocks and Business Cycles
6.3.1 A Model with Technology Shocks
6.4 The Keynesian View
6.4.1 A Simple Keynesian Model
7 Fiscal Policy
7.1 Accounting Framework
7.2 Present-Value Constraint
7.3 Fiscal Policy in the Life-Cycle Model
7.4 Three Ways to Finance a War
7.4.1 Analysis of Policy 1
7.4.2 Analysis of Policy 2
7.4.3 Analysis of Policy 3
7.4.4 Why Are Policy 1 and 3 Equivalent?
7.5 Multipliers
7.5.1 Empirics
7.5.2 Theoretical Models for Multipliers
7.6 Social Security Systems
7.6.1 Social Security: Theory
7.6.2 Social Security: Some US Facts
7.7 Overview
7.8 Key Concepts
1. Introduction
first microeconomics course, will be of doubtful relevance.1 Instead, general equilibrium methods
will be key. Such methods describe how all relevant variables (i.e. prices and quantities of all
goods and services) are simultaneously determined. This is one reason why macroeconomics is
difficult. The growth model that we will use in this book is an example of such a general equilibrium
model.
1.3 Microfoundations
The approach to answering these questions in this book will be based almost entirely on
microeconomic principles. The book analyzes theoretical model economies populated by individual
consumers that maximize utility and firms that maximize profit. As in microeconomics,
consumers pick the best choices within their budget sets and firms choose production inputs to
maximize profit. An implication of the assumption that consumers make their best choices is that
persuasive arguments for how government policy can produce welfare gains will have to be more
subtle than a layperson may at first appreciate. An argument cannot be based on consumers making
suboptimal choices within their budget sets and the government simply pointing this out.
1.4 Expectations
We will also assume that consumers and firms are forward-looking as they formulate
their best choices. Consumers are forward-looking in that they understand the possible shocks that
may impact the economy and how these shocks may impact the economic variables that matter for
them (for example, wages and interest rates). This assumption is called rational expectations in the
macroeconomic literature.
The information about the future that the agents possess can be key. As we will see later on,
the response of such forward-looking consumers to shocks anticipated to affect the economy in
a temporary way can differ greatly from the response to shocks of the same magnitude that are
anticipated to affect the economy permanently. Also, the assumption of forward-looking agents
will be especially important in contemplating the effects of alternative government policies. The
analysis will be based on the assumption that consumers understand the effects of new policies
and make reasoned decisions based on how the world works under the new policies. The
forward-looking hypothesis is potentially highly relevant for policy evaluation because changes in
policy rules may affect the future economy in ways that are very hard to predict from
past experience alone when this past experience does not include the type of policy rule variation being
considered.
on the centrality of growth theory. Thus, the book starts out by developing a theory of long-run
growth. The growth models used in this book are variants of the neoclassical growth model.
2. Measurement of Output and Prices
What is the output of the economy and how is it used? The US National Income and Product
Accounts (NIPA) are a conceptual framework for organizing data on the production of goods and
services and data on the incomes received by factors of production. The origins of this framework
date back to the 1930s, when future Nobel laureate Simon Kuznets was commissioned by the US
Congress to develop an accounting framework for national income. His first calculations showed
that national income had dropped more than 50 percent between 1929 and 1932.1 We now give a
brief sketch of the NIPA accounting framework.2
Definition 2.1.1 — Final Sales (or Expenditure) Method. Denoting the price of final good g
as $p_g$ and the final quantity produced of good g as $y_g$, then measured GDP, using the Final Sales
method, is given by:
$$Y = \sum_g p_g y_g \tag{2.1}$$
This method involves first getting a list of all the final goods. Imagine that these are numbered
so that g indicates the number of the good. The method states that we figure out the total expenditure
$p_g y_g$ on final good g and then add up these expenditures across all the goods on this comprehensive
list. For this method to work we do not need to observe the final goods output $y_g$
and the price $p_g$ of good g separately, only their product, which is the expenditure. In some parts of
the economy prices and physical quantities are easily determined. For example, for oil producers
output can be measured in barrels of oil of a given quality grade and prices are stated in
dollars per barrel. However, in the legal services sector one can observe the total expenditure on
legal services but the units in which the output of these legal services can be measured are unclear.
Definition 2.1.2 — Value Added Method. Denoting the value added by firm f as $VA_f$, then
measured GDP, using the Value Added method, is given by:
$$Y = \sum_f VA_f \tag{2.2}$$
where $VA_f = \text{Sales}_f - \text{Intermediate Goods Purchased}_f$.
The Value Added method involves creating a list of all the firms in the economy. We can index
a firm by its number f on this list. The method then computes the value added of each firm and
adds the value added of all the different firms together. In the simple case in which a firm produces
just a single good, the dollar value of the sales of the firm could be thought of as equaling
$p_f y_f$, the product of the price of the good and the quantity produced. The term Intermediate
Goods Purchased$_f$ in the formula above stands for the value of all the intermediate goods that are
purchased by firm f.
An intermediate good is a good produced by a firm that is sold to another firm to be used in
production. Thus, an intermediate good does not leave the firm sector. A final good is a good
produced by a firm and sold directly to a household. An example of an intermediate good is the
corn which is produced by a farmer and sold to Kellogg’s to be converted into Kellogg’s Corn
Flakes. The Corn Flakes produced and sold to households counts as a final good. Any corn which
is produced by a farmer and sold directly to households is considered to be a final good. Thus,
some part of the total production of corn enters the calculation as an intermediate good and some
part enters the calculation as a final good.
Definition 2.1.3 — Factor Income Method. Measured GDP, according to the Factor Income
Method, is the sum of all payments to national factors of production, depreciation and indirect
business taxes minus the net foreign factor income payments.
Y = (1) + (2) + (3) (2.3)
(1) = Wages + Proprietor’s Income + Corporate Profit + Interest + Rent (2.4)
(2) = Indirect Taxes - Net Foreign Factor Income (2.5)
(3) = Depreciation (2.6)
Note: In national accounts, (1) is called National Income and (1) + (2) is called Net Domestic Product.
The Factor Income method is not as easily stated or as easily explained as the other methods.
However, the basic idea behind this method is not complicated. The basic idea is that all the value
of the final production of the firm sector of the economy has to be paid out to the owners of the
capital and labor (factors of production) that produce this output. Thus, instead of keeping track of
the value of final goods produced we could just count up all the incomes (factor payments) paid to
factors of production.
[Figure 2.1: The Plumber’s Diagram. Factor payments flow from firms to households; final goods expenditures flow from households to firms.]
This simple idea is often illustrated using the Plumber’s Diagram in Figure 2.1. The intuition is
that the payments made by households to firms in exchange for final goods can be viewed as one
flow of water through a closed system of pipes and, in the absence of any leakages, this amount of
water must also flow from firms to households. Households supply the labor and capital services
that firms use to produce final goods.
The next step is to try to make this idea work in practice. First, make a list of distinct types
of payments to factors of production. This would include (i) wages paid to employees, (ii) all the
income paid to sole proprietors: those who are self-employed, (iii) all the corporate profit by firms
organized as corporations, (iv) the net interest paid by firms to lenders and (v) the rent paid by firms
or individuals for using capital (e.g. using a building or using machinery for a period of time). All
of these items are contained in the National Income component of GDP.
One important leakage is due to government. Specifically, governments sometimes take away
some part of a firm’s income before the firm has had a chance to pay out this income to labor and
capital. Sales taxes in the United States are an example of this. The national accounts add this
leakage back so that both the income and expenditure approach will count the same thing. This
leakage is labeled Indirect Taxes in the Income approach. Thus, it is apparent that in practice Figure
2.1 needs to be modified to capture this important leakage.
Depreciation is another important complication that must be addressed. Here the issue is not
whether or not physical capital wears out over time or by use. Instead, the Depreciation term is
added back into the GDP formula for the factor income method simply because income accountants
end up using data on corporate profits and proprietors’ income that are calculated after subtracting an
accounting measure of depreciation allowed by tax laws. For example, in a stylized calculation of
corporate profits, a corporation starts with total revenue and then subtracts wages and depreciation
to get to corporate profits. The upshot is that the sum of wages and corporate profit does not
equal the corporation’s revenue. One would be missing the “depreciation” subtracted. Thus, the
depreciation calculated by the accountants based on tax law needs to be tacked back on so that all
the revenue of the firm is accounted for in payments to owners of the factor inputs used by the firm.
This accounts for why a Depreciation term is added in term 3 of the formula for the Factor Income
method.
$$Y = \sum_g p_g y_g = p_1 y_1 + p_2 y_2 + p_3 y_3 = 1 \times 0 + 2 \times 0 + 4 \times 10 = 40 \tag{2.7}$$
To apply the expenditure approach it is critical to keep in mind the distinction between final
goods production, intermediate goods production and the total production of a good. In this
example, the total production of wheat is 10 units but all of this is sold to the Miller and converted
into flour. Thus, of the 10 units of total production of wheat, 10 count as intermediate goods
production and 0 count as final goods production. Recall that the production of some amount of a
good is considered to be final goods production if it is sold directly to households. The production
of some amount of a good which is sold to another goods producer is considered to be intermediate
goods production. It is useful to look again at the Plumber’s Diagram in Figure 2.1 to see this
distinction.
To apply the value added approach, we simply calculate the value added for each firm. In this
example, the Farmer, Miller and Baker are considered to be different firms. The value added of
each firm is measured in a common unit of account which is taken to be dollars in this example.
One could use a different but common unit of account, such as bread, if one wanted to do so. There
is nothing wrong with doing GDP accounting in a different unit of account.
To apply the income approach, we need to make some assumption on how the Farmer, Miller
and Baker are organized. They could be organized as sole proprietors or as firms which pay out
wages and profits. The text of this example did not provide this information. If we assume that
each is a sole proprietor then the income of each is simply its value added, in which case the Farmer’s
and the Miller’s incomes are both 10, whereas the Baker’s income is 20. This is the assumption used
above.
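The agreement of the three approaches in this example can be checked with a short Python sketch. It is an illustration only: the prices and final quantities are taken from equation (2.7), while the flour quantity of 10 units is an assumption chosen so that the Miller's value added equals the 10 stated above. The dictionary layout and function names are hypothetical.

```python
# Farmer -> Miller -> Baker chain from the example above.
# Prices and final quantities follow equation (2.7); the flour quantity
# (10 units) is an assumption chosen to match the Miller's value added of 10.
firms = {
    "Farmer": {"good": "wheat", "price": 1, "output": 10, "final": 0, "inputs": 0},
    "Miller": {"good": "flour", "price": 2, "output": 10, "final": 0, "inputs": 10},
    "Baker": {"good": "bread", "price": 4, "output": 10, "final": 10, "inputs": 20},
}


def gdp_final_sales(firms):
    """Expenditure approach: price times final quantity, summed over goods."""
    return sum(f["price"] * f["final"] for f in firms.values())


def value_added(firm):
    """Sales minus the value of intermediate goods purchased."""
    return firm["price"] * firm["output"] - firm["inputs"]


def gdp_value_added(firms):
    """Value added approach: sum value added over firms."""
    return sum(value_added(f) for f in firms.values())


def gdp_income(firms):
    """Income approach under the sole-proprietor assumption:
    each owner's income equals the firm's value added."""
    proprietor_income = {name: value_added(f) for name, f in firms.items()}
    return sum(proprietor_income.values())


print(gdp_final_sales(firms), gdp_value_added(firms), gdp_income(firms))
```

All three functions return 40. Only bread enters the final sales calculation, whereas every firm contributes to the value added and income calculations.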
Example 2.2 — Highlight the Treatment of Depreciation. Consider a world with two firms
that use capital and labor to produce two types of goods. Firm 1 produces 20 dollars of a
consumption good using labor and capital and firm 2 produces 10 dollars of a capital good using labor. The
output, wages, profit and depreciation for these two firms are stated below.
$$Y = C + I + G = 20 + 10 + 0 = 30 \tag{2.12}$$
$$Y = \sum_f VA_f = VA_1 + VA_2 = (20 - 0) + (10 - 0) = 30 \tag{2.13}$$
the output of all firms is paid to factors of production (i.e. owners of capital and labor), then the
sum of factor payments must equal the value of this final output. To clarify this point, suppose that
the corporate accountants (or the corporate tax laws) change the rules for calculating depreciation.
Suppose, to be concrete, that depreciation for firm 1 in Example 2.2 is now 10 rather than 5. GDP
computed using the income method will still be 30, the value found in (2.12) and (2.13). The reason
is that corporate profits shrink by 5
but depreciation grows by 5. GDP, as measured by the income approach, is unchanged.
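The offsetting movement of profit and depreciation can be sketched in Python. The revenues (20 and 10) and firm 1's depreciation figures (5 and 10) come from the example; the wage figures and firm 2's zero depreciation are hypothetical values introduced only for illustration, since the table for the example is not reproduced here.

```python
def income_method_gdp(firms):
    """Sum wages, profit and depreciation over firms.

    Profit is computed the way a stylized corporate accountant would:
    revenue minus wages minus the depreciation allowance.
    """
    total = 0
    for firm in firms:
        profit = firm["revenue"] - firm["wages"] - firm["depreciation"]
        total += firm["wages"] + profit + firm["depreciation"]
    return total


def example_firms(firm1_depreciation):
    # Wages of 10 for each firm and zero depreciation for firm 2 are assumptions.
    return [
        {"revenue": 20, "wages": 10, "depreciation": firm1_depreciation},  # firm 1
        {"revenue": 10, "wages": 10, "depreciation": 0},  # firm 2
    ]


gdp_old_rule = income_method_gdp(example_firms(5))   # depreciation of 5
gdp_new_rule = income_method_gdp(example_firms(10))  # depreciation of 10
print(gdp_old_rule, gdp_new_rule)
```

Both calls return 30, matching equations (2.12) and (2.13): raising the depreciation allowance lowers measured profit by the same amount, so the sum is unaffected.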
Example 2.3 — Highlight GDP vs GNP. A small country produces 10 in vacation services using
capital and labor. Labor is paid 5 and capital is paid 5 in rent. The country’s nationals do not own
any capital and their only source of labor income is work within the country.
What are GDP and GNP? To calculate GDP we use the factor incomes approach as there is
information on factor income payments. To calculate GNP we use the definition, discussed in
section 2.3, that GNP is GDP plus net foreign factor income.
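One way to work through this example is sketched below in Python. The sketch assumes that all workers are nationals and that, because nationals own no capital, the entire rent of 5 is paid to foreign owners. Indirect taxes and depreciation are taken to be zero, as the example mentions neither. The variable names are illustrative.

```python
wages_paid = 5  # paid to workers, assumed to be nationals
rent_paid = 5   # paid to owners of capital

# Factor incomes approach: GDP is the sum of factor payments made within
# the country (no indirect taxes or depreciation in this example).
gdp = wages_paid + rent_paid

# Assumption: nationals own no capital, so all rent goes to foreign owners,
# and nationals earn no factor income abroad.
factor_income_from_abroad = 0
factor_income_paid_abroad = rent_paid
net_foreign_factor_income = factor_income_from_abroad - factor_income_paid_abroad

# Definition: GNP is GDP plus net foreign factor income.
gnp = gdp + net_foreign_factor_income
print(gdp, gnp)
```

Under these assumptions GDP is 10 and GNP is 5: half of what is produced inside the country is income of foreign owners of capital.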
Example 2.4 — Highlight Intermediate Goods and Productivity Growth. A farmer can
produce 1 unit of corn for each unit of labor. A restaurateur can produce 1 unit of corn pancakes
using one unit of corn and one unit of labor. In period 1 the farmer produces 10 units of corn using
10 units of labor. In period 1 the restaurant produces 10 units of pancakes using 10 units of corn and
10 units of labor. The period 1 prices of corn and pancakes are $(p_{corn}, p_{pancakes}) = (1, 2)$.
In period 2 the productivity improves. The farmer can now produce 2 units of corn per unit of
labor and the restaurant can produce 2 units of pancakes for each unit of corn and unit of labor it
uses. In period 2 the farmer produces 40/3 units of corn using 20/3 units of labor. In period 2 the
restaurant produces 80/3 units of pancakes using 40/3 units of corn and 40/3 units of labor. The
total labor used in the economy is 20 units in both period 1 and 2, although its allocation across
firms differs across periods.
What is real GDP in each period using period 1 prices? The answer is determined by applying
the expenditure approach. A subscript on GDP denotes the period in which GDP is measured. Real
GDP more than doubles across periods even though productivity in producing corn
and in producing pancakes each only doubles.
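The calculation can be sketched in Python using exact fractions. The quantities and period 1 prices are those given in the example; the only modeling step is to treat all corn as an intermediate good, since in both periods the restaurant uses the entire corn harvest. Names such as `final_output` are illustrative.

```python
from fractions import Fraction

# Period 1 prices are used as base-year prices.
base_prices = {"corn": Fraction(1), "pancakes": Fraction(2)}

# Final goods output in each period. All corn is used by the restaurant,
# so the final output of corn is zero in both periods.
final_output = {
    1: {"corn": Fraction(0), "pancakes": Fraction(10)},
    2: {"corn": Fraction(0), "pancakes": Fraction(80, 3)},
}


def real_gdp(period):
    """Expenditure approach at base-year (period 1) prices."""
    return sum(base_prices[g] * q for g, q in final_output[period].items())


growth_factor = real_gdp(2) / real_gdp(1)
print(real_gdp(1), real_gdp(2), growth_factor)
```

Real GDP is 20 in period 1 and 160/3 in period 2, so it rises by a factor of 8/3. The factor exceeds 2 because the productivity gains at the two stages of production compound: the same 20 units of labor now deliver 80/3 pancakes rather than 10.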
government expenditures and federal government expenditures. A key component of state and local
government expenditures is expenditure on public education.
Table 2.2 highlights the results of applying the factor income method to US data. The sum of the
first five components is called National Income. Tax on Production and Indirect Tax are synonyms.
Earlier we used the term Indirect Tax. These taxes include property taxes, sales taxes and excise
taxes among other taxes in this category. Adding National Income and Tax on Production together
we get Net Domestic Product. Adding Net Domestic Product and Depreciation together we finally
get to GDP.4
Figure 2.2 graphs the behavior of real GDP and expenditure components over time. The vertical
scale is in log units. Log units are useful when a series grows over time. Specifically, if a series
has a constant positive growth rate then when the series is plotted in log units it will be a straight
line with a positive slope.5 The greater the slope, the larger the growth rate of the series being
plotted. One key property of Figure 2.2 is that, aside from the Great Depression, WWII and some
large recessions, US real GDP grows at a fairly constant rate over much of the last 100 years. Thus,
an important issue in macroeconomics is what explains such a sustained positive growth rate. The
chapter on growth theory will address this issue.
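The straight-line property of a log plot can be verified numerically. The Python sketch below uses a hypothetical series with initial value 100 and a constant growth rate of 3 percent (neither number is taken from US data) and checks that the period-to-period change in the log of the series is always log(1 + g).

```python
import math


def log_series(y0, g, periods):
    """Logs of a series that starts at y0 and grows at constant rate g."""
    return [math.log(y0 * (1 + g) ** t) for t in range(periods + 1)]


# Hypothetical values chosen for illustration only.
logs = log_series(y0=100.0, g=0.03, periods=50)

# Slope of the log series between consecutive periods.
slopes = [later - earlier for earlier, later in zip(logs, logs[1:])]

constant_slope = all(abs(s - math.log(1.03)) < 1e-9 for s in slopes)
print(constant_slope)
```

Every slope equals log(1.03) up to rounding error, which is why a series with a constant growth rate appears as a straight line when plotted in log units.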
Two other properties stand out in Figure 2.2. First, consumption parallels GDP so that the
long-run growth rate of aggregate consumption expenditures is similar to that of GDP. Second,
the year-to-year fluctuations in investment expenditures are much more volatile in
percentage terms than are the corresponding fluctuations in consumption expenditures or GDP. Thus,
a major question in business-cycle theory is what accounts for the year-to-year or quarter-to-quarter
fluctuations in GDP and why investment fluctuations are so volatile in percentage terms compared
to consumption and GDP. Business-cycle theory is therefore concerned with the patterns in the
high-frequency wiggles in Figure 2.2 rather than the magnitude of the slopes of the long-run trends of
these aggregates. The chapter on business-cycle fluctuations will offer competing theories for the
fundamental sources of these fluctuations.
[Figure 2.2: US Real GDP and Components. Vertical axis: log scale (2009 dollars), 10 to 10000; horizontal axis: years, 1920 to 2020.]
4 In practice, there are a few additional terms in the Bureau of Economic Analysis tables for the factor incomes
approach that we will not discuss. One of these is called Statistical Discrepancy.
5 Here is the math. Assume that the variable y grows at a constant growth rate g so that $y_t = y_0 (1 + g)^t$. Take logs of
both sides to get $\log y_t = \log(y_0 (1 + g)^t) = \log y_0 + t \log(1 + g)$. Thus, the slope will be $\log(1 + g)$, so the slope is
larger the larger is the growth rate g.
Figure 2.3 graphs the behavior of labor’s share of income in the US. The calculation is based
on the factor incomes method for calculating GDP highlighted in Table 2.2. Labor’s share is not
simple to calculate because some of the items (e.g. Tax on Production and Proprietor’s Income)
calculated in the BEA Tables are not clearly a payment to labor input or a payment to capital inputs.
Proprietor’s Income is all of the income of businesses organized as sole proprietors and, therefore,
reflects both the payments to the business owner’s labor and capital. Tax on Production (e.g. sales
taxes) are taxes that are paid by a firm before the firm has the chance to pay firm revenues to capital
and labor. Thus, it is unclear how to split Tax on Production between capital and labor payments.
With these issues in mind, we calculate labor’s share as the ratio of the Compensation of Employees
to the sum of all of the subcomponents of GDP listed in Table 2.2 except Tax on Production and
Proprietor’s Income. Figure 2.3 graphs the results of this calculation.
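The labor share calculation just described can be sketched in Python. The component names follow the factor income method, but the numbers are hypothetical placeholders, not BEA data, and the function name is illustrative.

```python
# Hypothetical factor income components (NOT actual BEA figures).
components = {
    "Compensation of Employees": 55.0,
    "Proprietor's Income": 8.0,
    "Corporate Profit": 10.0,
    "Interest": 4.0,
    "Rent": 3.0,
    "Tax on Production": 7.0,
    "Depreciation": 13.0,
}

# Items that are not clearly a payment to labor or to capital.
ambiguous = {"Tax on Production", "Proprietor's Income"}


def labor_share(components, ambiguous):
    """Compensation of employees divided by GDP net of the ambiguous items."""
    denominator = sum(v for k, v in components.items() if k not in ambiguous)
    return components["Compensation of Employees"] / denominator


share = labor_share(components, ambiguous)
print(round(share, 3))
```

Dropping the ambiguous items from the denominator amounts to assuming that they split between labor and capital in the same proportion as the rest of income.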
Economists have long been interested in patterns of how GDP is divided between households
and how GDP is divided between classes of factor inputs. Developing a theory for how GDP is
divided between labor and capital income (the so-called functional distribution of income) is a
problem with a long history. David Ricardo viewed this issue as the "principal problem in political
economy". One might guess that a large decline in labor’s share may lead to revolution in some
countries or a large exodus of workers to other countries. Karl Marx famously made the opposite
conjecture. He thought that the “tendency of the rate of profit to fall” would lead to a crisis in
capitalism.
[Figure 2.3: Labor's Share: US 1929-2015. Vertical axis: labor's share, 0 to 1; horizontal axis: years, 1920 to 2020.]
Figure 2.3 shows that in US data labor’s share of income has been fairly stable for roughly the
last 100 years and has averaged 65 percent over this period. This also implies that capital’s share of
income (i.e. 1 minus labor’s share) has averaged roughly 35 percent.
However, in the last 15 years or so labor’s share has fallen in the US. This has led to a
substantial amount of new empirical work. Some authors have documented that the decline in
labor’s share in the last few decades has occurred in many advanced economies.6 This fact would
seem to suggest that forces which are impacting all of these countries, such as technological change,
outsourcing and the integration of China into the world economy, are natural candidate
explanations.
129, 61-103.
all countries. Denote the basket by quantities (x1 , x2 , ..., xn ) of the n distinct goods. The method
consists of computing the cost of the basket in each country and then taking the ratio. The Economist
magazine regularly provides PPP conversion rates based on a basket consisting only of one Big
Mac hamburger. Given this choice of a common basket, the PPP conversion rate equals the US cost
of the basket as a ratio to the Indian cost of the basket.
Method 3 is a very different comparison that is based on calculating GDP across countries using
a common set of “world relative prices”. There is a literature on how to weight country specific
relative prices to get the world relative prices for pairs of goods. We will not get into the merits of
different schemes to assign these weights. A large empirical literature in economics is based on a
longstanding research project to improve international comparisons by applying Method 3.
Definition 2.6.1 — Exchange Rate Method. Compare $GDP_{US}$ to $e \, GDP_{India}$, where e is the
exchange rate in units of dollars per unit of Indian currency.
Definition 2.6.2 — PPP Exchange Rate $e^*$ Method. Compare $GDP_{US}$ to $e^* GDP_{India}$, where
$e^* = \sum_g p_g^{US} x_g / \sum_g p_g^{India} x_g$ is the PPP exchange rate.
Definition 2.6.3 — World Relative Price Method. Compare $\sum_g p_g y_g^{US}$ to
$\sum_g p_g y_g^{India}$, where $p_g$ is the “world relative price” of good g.
Comment: You can download GDP calculated in world relative prices from the Penn World
Tables website (http://pwt.econ.upenn.edu/) . This is the standard data source that economists
use to make cross-country GDP comparisons.
If one compares GDP across countries using the Exchange Rate method then a typical finding
is that the ratio of GDP in a relatively poor country (e.g. India) to US GDP is much smaller
than the same ratio calculated using the World Relative Price method. The economic size
of developing countries in the world is much larger when comparisons are made using Method 3.
One reason for this is that the relative price of internationally non-tradable goods (e.g. haircuts
and housing) to tradable goods (e.g. corn and computers) is much lower in poor countries and
non-tradable goods are an important component of GDP.
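The first two methods defined above can be contrasted in a short Python sketch based on a one-good Big Mac basket. All numbers (the two Big Mac prices, the market exchange rate and Indian GDP in local currency) are hypothetical and serve only to show the mechanics; the function name is illustrative.

```python
def ppp_exchange_rate(prices_us, prices_india, basket):
    """US cost of the basket divided by the Indian cost of the same basket.

    The result is in dollars per unit of Indian currency.
    """
    cost_us = sum(prices_us[g] * x for g, x in basket.items())
    cost_india = sum(prices_india[g] * x for g, x in basket.items())
    return cost_us / cost_india


# Hypothetical inputs, chosen only for illustration.
basket = {"big_mac": 1}
prices_us = {"big_mac": 5.0}        # dollars
prices_india = {"big_mac": 200.0}   # rupees
market_exchange_rate = 0.0125       # dollars per rupee
gdp_india_local = 1000.0            # rupees

e_star = ppp_exchange_rate(prices_us, prices_india, basket)

gdp_india_exchange_rate = market_exchange_rate * gdp_india_local
gdp_india_ppp = e_star * gdp_india_local
print(e_star, gdp_india_exchange_rate, gdp_india_ppp)
```

With these made-up numbers the PPP rate exceeds the market rate, so Indian GDP expressed in dollars is larger under the PPP method than under the Exchange Rate method.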
$$CPI_t = \frac{\sum_g p_{gt} x_g}{\sum_g p_g^* x_g}$$
The numerator of the CPI is the cost of the basket in year t, whereas the denominator is the cost
of the same basket in the base year. Note that as defined above the CPI is equal to 1.0 in the base
year. In some textbooks, the CPI as defined above is multiplied by 100 so that the index equals
100 in the base year rather than 1. This type of index is sometimes referred to as a Laspeyres
Index.
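A fixed-basket index of this kind can be computed as in the following Python sketch. The two goods, the basket and all prices are hypothetical, and the function name `cpi` is illustrative.

```python
def cpi(prices_t, base_prices, basket):
    """Cost of the fixed basket at year-t prices relative to base-year prices."""
    cost_now = sum(prices_t[g] * basket[g] for g in basket)
    cost_base = sum(base_prices[g] * basket[g] for g in basket)
    return cost_now / cost_base


# Hypothetical fixed basket and prices.
basket = {"yoga_lessons": 10, "game_tickets": 5}
base_prices = {"yoga_lessons": 10, "game_tickets": 20}
prices_t = {"yoga_lessons": 11, "game_tickets": 26}

cpi_base = cpi(base_prices, base_prices, basket)
cpi_t = cpi(prices_t, base_prices, basket)
print(cpi_base, cpi_t)
```

The index equals 1.0 in the base year by construction. In year t the basket costs 240 rather than 200, so the index is 1.2.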
The second price index is a “time-varying basket” index. The standard example of this type of
index is known as the GDP Deflator. This type of price index is also a weighted average of prices.
In this case the weights ygt are the quantities of the different final goods produced in year t. From a
mathematical point of view, the key difference between the two indexes is that in one the weights
do not change but in the other the weights change over time.
$$\text{deflator}_t = \frac{\sum_g p_{gt} y_{gt}}{\sum_g p_g^* y_{gt}}$$
The numerator of the GDP Deflator is nominal GDP in year t. This follows if the weights
$y_{gt}$ are the quantities of final good g produced in year t. The denominator is real GDP in year t.
The index thus tells one the cost of buying up current-year final output at current-year prices
compared to buying it up at base-year prices. This type of index is sometimes referred to as a
Paasche index.
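A time-varying basket index can be sketched in Python in the same spirit. The goods, prices and year-t quantities below are hypothetical, and the function name is illustrative.

```python
def gdp_deflator(prices_t, base_prices, quantities_t):
    """Nominal GDP in year t divided by real GDP in year t (base-year prices)."""
    nominal_gdp = sum(prices_t[g] * quantities_t[g] for g in quantities_t)
    real_gdp = sum(base_prices[g] * quantities_t[g] for g in quantities_t)
    return nominal_gdp / real_gdp


# Hypothetical prices and year-t final goods quantities.
base_prices = {"yoga_lessons": 10, "game_tickets": 20}
prices_t = {"yoga_lessons": 11, "game_tickets": 26}
quantities_t = {"yoga_lessons": 12, "game_tickets": 4}

deflator_t = gdp_deflator(prices_t, base_prices, quantities_t)
print(deflator_t)
```

Here nominal GDP is 236 and real GDP is 200, so the deflator is 1.18. Because the weights are year-t quantities, a different year with the same prices but different quantities would generally give a different value.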
Why is it important to measure price indexes such as the CPI accurately? Here are some
standard answers:
1. The CPI is used to index social security retirement payments. Thus, any systematic bias will
be compounded year after year. If the CPI is biased upwards as a measure of the “cost of
living” in the sense that the CPI tends to grow faster than a true cost of living index, then
social security payments will effectively bear interest. This can turn out to be in aggregate
financial terms a very big deal simply because of the logic of compound interest.
2. One of the two mandates of the US Federal Reserve Bank is to keep inflation low. Therefore
biases in the price index can influence monetary policy.
3. The federal income tax code in the United States links tax brackets to the CPI. Thus, absent a
change in legislation, if the inflation rate is 10 percent then the level of income at which a
given tax rate applies is also raised by 10 percent. Any systematic bias in measured inflation
can increase or decrease real tax revenue as inflation occurs.
4. The price data collected by the Bureau of Labor Statistics in the U.S. is used to compute GDP.
Specifically, one can figure out the quantity of the different final goods that are produced
each year by dividing total expenditures on these goods by prices. If the prices are too high,
then the quantities produced, which one infers from price and expenditure data, are too low.
This can be important in computing GDP growth rates. Suppose it is the case that year by
year the calculated inflation rate in a specific good is too high in the sense of higher than true.
Then the growth rate of real GDP (using base year prices) will be too low.
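The compounding logic in the first answer can be illustrated with a few lines of Python. The annual bias of 0.5 percent and the 20-year horizon are hypothetical values used only to show the order of magnitude; the function name is illustrative.

```python
def cumulative_overpayment(annual_bias, years):
    """Fraction by which indexed payments exceed a correctly indexed payment
    after the given number of years, if the index grows too fast by
    annual_bias each year."""
    return (1 + annual_bias) ** years - 1


# Hypothetical bias and horizon.
excess = cumulative_overpayment(annual_bias=0.005, years=20)
print(round(excess, 3))
```

Under these assumptions a bias of half a percentage point per year raises payments by roughly 10 percent after 20 years, which is the sense in which a small systematic bias compounds.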
[Figure: budget line and indifference curve, with $x_2^*$ marked on an axis.]
Now suppose that the people in charge of computing the consumer price index (CPI) go out and
observe the actual year 1 consumption choices $(x_1^*, x_2^*)$ and corresponding (base year) prices $(p_1^*, p_2^*)$.
These consumption choices are now assumed to be the basis for the fixed weights, discussed in the
last section, for calculating the CPI. Furthermore, suppose that in year 2 the CPI folk observe
new prices $(p_{1,2}, p_{2,2})$ that differ from base year prices. They could then compute $CPI_2$ for year
2 as follows:
$$CPI_2 = \frac{p_{1,2} x_1^* + p_{2,2} x_2^*}{p_1^* x_1^* + p_2^* x_2^*}$$
The interesting question is then to ask whether using the CPI in this way overcompensates,
undercompensates or correctly compensates these retirees for changes in the cost of living. Thus,
does the CPI serve as a cost-of-living index as we defined it above? The answer is no. Specifically,
using the CPI in this way will in theory sometimes overcompensate in the sense that it gives too
much money to the retirees so that the retirees will be able to get strictly more utility in year 2 than
in year 1.
This overcompensation always occurs when two conditions hold. The first condition is that
between year 1 and year 2 there is a change in relative prices so that
$\frac{p_{1,2}}{p_{2,2}} \neq \frac{p_1^*}{p_2^*}$. The second condition
is that the indifference curve through the original point $(x_1^*, x_2^*)$ is “smooth” in that there is no kink
so that the indifference curve has a unique tangent line at this point. As long as both these hold,
then the new budget line in year 2 will still run through the point $(x_1^*, x_2^*)$ but will cut through the old
indifference curve. The upshot is that because the new budget line cuts through the old indifference
curve there will be a better consumption choice in year 2 than the choice $(x_1^*, x_2^*)$ that was optimal
in year 1 at year 1 prices and incomes.7
Figure 2.5: Consumer Choice with Cost of Living Indexation and Relative Price Change
[Figure labels: game tickets (axis); indifference curve in base year; budget line in year 2; $x_2^*$.]
Figure 2.5 illustrates this idea. The solid budget line represents the consumption possibilities in
the base year. The dotted budget line represents the consumption possibilities in year 2 after two
things happen: (1) prices change and (2) the retiree’s income is adjusted using a cost of living index, so
the initial basket $(x_1^*, x_2^*)$ can be afforded at the new prices. The change in the budget line from the
base year to year 2 in Figure 2.5 is consistent with a larger proportional increase of the price of
game tickets compared to the price of yoga lessons. This implies a fall in the relative price of yoga
lessons. Thus, the new budget line is flatter and consuming a bit more in yoga lessons and a bit less
in game tickets would be a way to increase utility. This is illustrated by moving from point $(x_1^*, x_2^*)$
to any point along the segment of the dotted line in Figure 2.5 that lies above the indifference curve
attained in the base year.
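The overcompensation argument can be checked numerically. The Python sketch below assumes a Cobb-Douglas utility function with equal weights on the two goods, which is one convenient example of smooth preferences and is not taken from the text. The prices and income are hypothetical; good 1 stands for yoga lessons and good 2 for game tickets.

```python
def demand(income, prices, alpha=0.5):
    """Utility-maximizing bundle for Cobb-Douglas utility (an assumed form):
    a constant share alpha of income is spent on good 1."""
    p1, p2 = prices
    return (alpha * income / p1, (1 - alpha) * income / p2)


def utility(bundle, alpha=0.5):
    x1, x2 = bundle
    return x1 ** alpha * x2 ** (1 - alpha)


def cost(bundle, prices):
    return sum(p * x for p, x in zip(prices, bundle))


# Base year: hypothetical prices and income.
base_prices = (10.0, 20.0)   # (yoga lessons, game tickets)
base_income = 200.0
base_bundle = demand(base_income, base_prices)

# Year 2: game tickets rise proportionally more than yoga lessons.
new_prices = (11.0, 26.0)

# Income is indexed with the fixed-basket CPI.
cpi_2 = cost(base_bundle, new_prices) / cost(base_bundle, base_prices)
indexed_income = cpi_2 * base_income
new_bundle = demand(indexed_income, new_prices)

print(utility(base_bundle) < utility(new_bundle))
```

The indexed income exactly pays for the base-year bundle at the new prices, yet the retiree chooses a different bundle with more yoga lessons and fewer game tickets and reaches a strictly higher utility level, as the argument above predicts.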
7 There is nothing special about the choice of illustrating this theoretical point in the case of exactly two consumption
goods. If there are three goods, then the idea is that there is always overcompensation when the plane describing the
“budget line” cuts through the indifference surface at the best point $(x_1^*, x_2^*, x_3^*)$ when prices change. With more than three
goods the same ideas apply but visualization is difficult.
The economics of this overcompensation in using a fixed weight price index such as the CPI
as a cost-of-living index has been understood at a theoretical level for well over half a century.
There have been several literature surveys which have discussed the likely empirical magnitude
of the annual overcompensation due to the “substitution effect” highlighted in this section.8 The
Congressional Budget Office (1994) suggested that the overcompensation was between 0.2 and 0.8
percent per year.9
8 See Moulton (1996), Journal of Economic Perspectives, vol. 10, 159-77 for a discussion of (1) details of how the
CPI is computed in the U.S., (2) plausible magnitudes of any bias and (3) what statistical agencies were doing to allow
the CPI to more accurately mimic a cost-of-living index.
9 Congressional Budget Office (1994), Is the Growth of the CPI a Biased Measure of Changes in the Cost of Living?,
Washington DC.
3. Growth Theory
3.1 Introduction
This chapter describes the basic elements of the theory of economic growth that were laid out in
the work of Robert Solow. The growth theory that practitioners of economics use to this day is
influenced by this work. Solow received the Nobel prize in economics in 1987 for two important
papers on economic growth.1 Solow’s first paper provides a model of economic growth that can
explain some of Nicholas Kaldor’s growth facts. Kaldor’s facts consist of six empirical regularities
about economic growth. Solow’s second paper provides a method to decompose observed output
growth into the growth of inputs and the growth of technology. This chapter discusses each of these
contributions. We start out by discussing some useful properties of production functions and the
theory of profit maximization when firms take prices as given.
Yt = At F(Kt , Lt )
1 See Solow (1956), A Contribution to the Theory of Economic Growth, Quarterly Journal of Economics, 70, 65-94
and Solow (1957), Technical Change and the Aggregate Production Function, Review of Economics and Statistics, 39,
312-20.
Yt : output at time t
Kt : capital input at time t. The capital stock is a stock of capital goods that are devoted
to production. The capital input is a flow of production services from capital that is
proportional to the stock of capital.
Lt : labor input at time t. The labor input is the flow of labor services proportional to the
number of workers Lt .
At : technology level at time t
The mathematical expression AF(λ K, λ L) = λ AF(K, L) says that if both inputs are multiplied by any
number λ greater than zero, then output is also multiplied by λ .
We now discuss a key implication of production functions with constant returns to scale: When
the production function has constant returns to scale, the amount of output per unit of labor
depends exclusively on the amount of capital per unit of labor, and the level of technology. To
illustrate this, imagine that there are two countries with the same production function and the same
technological level A. Then, the “size” of the countries is irrelevant for determining which country
is richer in the sense of having a larger output per unit of labor. The only thing that is relevant is
which country has more capital per unit of labor, K/L. This logic is expressed below in mathematical
terms. It follows from the definition of constant returns to scale when the scaling factor is set to
λ = 1/L.
\[ \frac{Y}{L} = A F\left(\frac{K}{L}, \frac{L}{L}\right) = A F\left(\frac{K}{L}, 1\right). \]
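This implication can be checked numerically. The Python sketch below assumes a Cobb-Douglas form F(K, L) = K^β L^(1−β) with β = 0.3 and A = 1; the function name `output` and all numbers are illustrative assumptions, not values from the text. It compares two hypothetical countries of very different size that share the same capital per unit of labor.

```python
def output(K, L, A=1.0, beta=0.3):
    """Cobb-Douglas output A * K^beta * L^(1-beta); an assumed CRS example."""
    return A * K**beta * L**(1 - beta)

# Two hypothetical countries with the same capital per worker (K/L = 4)
# but very different sizes.
small = {"K": 40.0, "L": 10.0}
large = {"K": 4000.0, "L": 1000.0}

y_small = output(small["K"], small["L"]) / small["L"]
y_large = output(large["K"], large["L"]) / large["L"]

# Under constant returns to scale, Y/L = A * F(K/L, 1).
y_from_ratio = output(4.0, 1.0)

print(y_small, y_large, y_from_ratio)  # all three agree up to rounding
```

Output per unit of labor is the same in both countries because only the ratio K/L enters once the scaling factor 1/L is applied.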
The concept of constant returns to scale describes what happens to output when all inputs are
increased by a common factor. We now describe a key property of the production function when
only one input is varied, keeping all other inputs constant.
slope can be calculated by taking the derivative of output with respect to the relevant input.
3.2 Production Function 27
denote the marginal products of capital and labor. The notation adopted here is to use a subscript K
or L to denote that one is talking about the marginal product of capital or labor.3 Marginal products
are functions as a marginal product will depend on the quantities of inputs that are available.
AFK (K, L) : marginal product of capital
AFL (K, L) : marginal product of labor
Theorem 3.2.1 The profit maximization problem consists of choosing the amount of capital and
labor inputs to maximize profit, taking the price of output and the prices of factor inputs as given:
max over (K, L) of AF(K, L) − W L − RK
When a firm’s inputs (K, L) maximize profit, then each input’s marginal product is equal to the
price of the factor of production:
1. AFL (K, L) = W
2. AFK (K, L) = R
The theory implies that if the firm is maximizing profit then implications 1 and 2 above hold. The
reason is that if either of these conditions did not hold, then the firm could increase profit by varying
the input. For example, consider a specific quantity of inputs (K0 , L0 ) such that AFK (K0 , L0 ) > R.
The extra revenue from renting one extra unit of capital is greater than the cost of doing so. Therefore,
the firm could increase profit by a positive amount, approximately AFK (K0 , L0 ) − R > 0, by simply
using an additional unit of capital.
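The argument can be illustrated with a small numerical sketch. The Python code below assumes a Cobb-Douglas production function with β = 0.3 and A = 1, and hypothetical factor prices and input quantities; the names `profit` and `mpk` are illustrative. It starts from inputs at which the marginal product of capital exceeds R and checks that a small increase in capital raises profit.

```python
def profit(K, L, W, R, A=1.0, beta=0.3):
    """Profit A*F(K, L) - W*L - R*K with an assumed Cobb-Douglas F."""
    return A * K**beta * L**(1 - beta) - W * L - R * K

def mpk(K, L, A=1.0, beta=0.3):
    """Marginal product of capital for the assumed Cobb-Douglas F."""
    return beta * A * K**(beta - 1) * L**(1 - beta)

K0, L0 = 1.0, 1.0          # hypothetical starting inputs
R, W = 0.10, 0.70          # hypothetical factor prices
gap = mpk(K0, L0) - R      # positive: capital's marginal product exceeds its price

dK = 0.01                  # a small increase in capital
gain = profit(K0 + dK, L0, W, R) - profit(K0, L0, W, R)
print(gap > 0, gain > 0)   # both True: using more capital raises profit
```

The gain per unit of extra capital, `gain / dK`, is close to `gap`, which is the sense in which the profit increase equals the marginal product minus the rental price.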
λ > 0. Differentiate each side of this equation with respect to λ and use the Chain Rule to get Y = AFK (λ K, λ L)K +
The condition says that total output is equal to the sum of capital times the marginal product of
capital plus labor times the marginal product of labor. From the profit maximization problem, we
know that a profit-maximizing firm that takes factor prices as given will increase the use of each
production factor until the value of its marginal product equals its marginal cost. This was stated in
Theorem 3.2.1. Thus, we can replace the marginal products with the factor prices, to obtain the
equation below.
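The marginal products of capital and labor, for the Cobb-Douglas production function
Y = AK^β L^{1−β}, are as follows:
\[ A F_K(K, L) = \beta A \left(\frac{L}{K}\right)^{1-\beta} \tag{3.3} \]
\[ A F_L(K, L) = (1-\beta) A \left(\frac{K}{L}\right)^{\beta} \tag{3.4} \]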
These marginal products are obtained by taking the derivative of the production function with
respect to K and with respect to L, respectively.7 It is easy to see how these two marginal products
move when capital or labor is varied. For example, in equation (3.3), notice that K appears in the
denominator, and that 1 − β is a positive exponent. This implies that the marginal product of capital
falls when K increases, other things equal. This is the property of diminishing marginal products.
In equation (3.4) note that the marginal product of labor increases as capital increases, other things
equal. Thus, capital and labor are complements in a Cobb-Douglas production function.
AFL (λ K, λ L)L. By setting λ = 1 one proves the assertion. Leonhard Euler was a very famous Swiss mathematician,
who lived in the 1700’s.
5 Table 2.2 from the previous chapter indicated that corporate profits in the US exceeded 10 percent of GDP in 2016.
This fact does not contradict the theory. In calculating corporate profit in the US, corporate accountants do not subtract
the implicit rental value of all the buildings and equipment owned by a corporation when calculating corporate profit.
Thus, it is not surprising that corporate (accounting) profits in the US are positive.
6 See Cobb, C. W.; Douglas, P. H. (1928). “A Theory of Production”. American Economic Review 18: 139-165.
7 The relevant result from calculus involved in taking these derivatives is that the slope of the function y = x^a is given
by dy/dx = a x^{a−1} .
We now calculate the share of output paid to capital and to labor when the production function
has the Cobb-Douglas form. We use the implication that factors are paid their marginal products
(i.e. W = AFL (K, L) and R = AFK (K, L)) and the Cobb-Douglas form Y = AK^β L^{1−β} .
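As a numerical sketch of this calculation, the Python code below pays each factor its marginal product under the Cobb-Douglas form and computes the resulting shares of output. The parameter values and input quantities are hypothetical, chosen only for illustration.

```python
A, beta = 1.0, 0.3            # assumed technology level and capital exponent
K, L = 8.0, 2.0               # hypothetical input quantities

Y = A * K**beta * L**(1 - beta)
R = beta * A * K**(beta - 1) * L**(1 - beta)   # R = A*F_K(K, L)
W = (1 - beta) * A * K**beta * L**(-beta)      # W = A*F_L(K, L)

capital_share = R * K / Y     # equals beta
labor_share = W * L / Y       # equals 1 - beta
profit = Y - R * K - W * L    # zero: factor payments exhaust output

print(capital_share, labor_share, profit)
```

Changing K or L in this sketch leaves the two shares unchanged, which is the sense in which the Cobb-Douglas exponents can be read as factor shares.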
So far, in the discussion about the properties of the production function, we have considered only
neutral technical change. However, one can consider a Cobb-Douglas production function with
8 See Adam Smith’s (1776), The Wealth of Nations.
either labor-augmenting or capital-augmenting technical change, and corroborate that all the
properties of constant returns to scale, decreasing marginal returns and zero profits still hold.
[Figure: College Wage Premium: US 1915-2005. Horizontal axis: year (1900 to 2020). Vertical axes: log wage ratio and log supply ratio. Series: log college-high school wage ratio and log supply ratio.]
The neoclassical theory developed in this chapter says that under competitive markets factors
of production get paid their marginal products. Thus, the ratio W_t^c/W_t^h of the college wage to the
high school wage should, according to this theory, equal the ratio of the marginal product of college
to high school labor. The equation below highlights this implication using the assumption of an
aggregate production function Y_t = A_t F(K_t, L_t^h, L_t^c) with inputs of capital K_t and high school and
college labor (L_t^h, L_t^c) and technology level A_t. As in previous sections, subscripts (c, h) next to the
production function are used to denote marginal products of college labor (c) and high school labor
(h).
9 See “The Race between Education and Technology” by Claudia Goldin and Lawrence Katz, NBER Working Paper
12984.
10 Since the natural log of the wage ratio is 0.64 in 1915 according to the US data, ln(W^c/W^h) = 0.64 ⇒ W^c/W^h = e^{0.64} =
1.90, so that college wages are above high school wages by 90 percent. Recall that the symbol e ≈ 2.718 is the base for
natural logs. The Wikipedia entry for natural log describes properties of logs.
An Unsuccessful Theory
Consider the Cobb-Douglas production function below. The exponents (β , αh , αc ) can be inter-
preted as the share of output paid to capital and to high school and college labor respectively. The
term At captures neutral technological change. The second equation below computes the marginal
products and simplifies by canceling common terms in the numerator and denominator. The third
equation below takes the log of both sides of the second equation and simplifies.
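As a sketch of the three steps just described, assume the Cobb-Douglas form written in the first line below; the exact functional form is an assumption consistent with the exponents (β, αh, αc) named in the text. Writing W_t^c and W_t^h for the college and high school wage:

```latex
\begin{align*}
Y_t &= A_t K_t^{\beta} (L_t^h)^{\alpha_h} (L_t^c)^{\alpha_c} \\
\frac{W_t^c}{W_t^h}
  &= \frac{\alpha_c A_t K_t^{\beta} (L_t^h)^{\alpha_h} (L_t^c)^{\alpha_c - 1}}
          {\alpha_h A_t K_t^{\beta} (L_t^h)^{\alpha_h - 1} (L_t^c)^{\alpha_c}}
   = \frac{\alpha_c}{\alpha_h} \, \frac{L_t^h}{L_t^c} \\
\log \frac{W_t^c}{W_t^h}
  &= \log \frac{\alpha_c}{\alpha_h} - \log \frac{L_t^c}{L_t^h}
\end{align*}
```

Under this form the technology term A_t cancels, so the log wage ratio can only fall when the relative supply of college labor rises.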
\[ \log \frac{W_t^c}{W_t^h} = \log \frac{\lambda_t}{1 - \lambda_t} + (\rho - 1) \log \frac{L_t^c}{L_t^h}, \quad \text{where } \rho - 1 < 0 \]
This theory suggests that variation in the log wage ratio over time is a race between two forces:
education and technology. On the one hand, increases in technology that complements college labor
(e.g. computers and information technology) increase the college premium. This is captured by the
term λ_t/(1 − λ_t). On the other hand, increases in the relative supply of college labor are a force
decreasing the college premium. This is captured by the term L_t^c/L_t^h. Goldin and Katz argue that
technology has won this race since the 1950s and accounts for the increase in the college premium.
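The two forces in the wage-ratio equation can be separated in a short numerical sketch. The Python code below evaluates the equation for hypothetical values of λ, the supply ratio, and ρ (chosen so that ρ − 1 < 0); the function name `log_wage_ratio` and all numbers are illustrative assumptions, not estimates.

```python
import math

def log_wage_ratio(lam, supply_ratio, rho):
    """log(Wc/Wh) = log(lam/(1-lam)) + (rho-1)*log(Lc/Lh), with rho < 1."""
    return math.log(lam / (1 - lam)) + (rho - 1) * math.log(supply_ratio)

rho = 0.5                                   # assumed value with rho - 1 < 0
base = log_wage_ratio(0.5, 1.0, rho)        # hypothetical starting point

more_supply = log_wage_ratio(0.5, 2.0, rho) # education: Lc/Lh doubles
more_tech = log_wage_ratio(0.6, 1.0, rho)   # technology: lambda rises
both = log_wage_ratio(0.6, 2.0, rho)        # both forces together

print(base, more_supply, more_tech, both)
```

A higher supply ratio lowers the premium and a higher λ raises it; which force dominates when both change depends on the magnitudes, which is the empirical question Goldin and Katz address.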
Definition 3.3.1 — Kaldor’s Growth Facts. The following six facts about (modern) economic
growth were set down by Nicholas Kaldor in 1957:a
1. Output per capita grows over time.
2. Capital per capita grows over time.
3. The capital-output ratio is approximately constant over time.
4. The shares of output paid to capital and to labor are approximately constant over time.
5. The return to capital does not have a strong trend.
6. Levels of output per capita vary widely across countries.
a See Kaldor, Nicholas (1957). "A Model of Economic Growth". The Economic Journal. 67 (268): 591–624.
A key issue is the degree to which these “facts” describe the behavior of particular countries
over particular time periods. We will only partially address this issue. For fact 1 above, we present
data constructed by Gregory Clark for England and by many authors for the UK as compiled by the
Bank of England.11
Figure 3.2 below plots GDP per capita in the UK and Net National Income (NNI) per capita in
England over several centuries. Labor productivity, measured by GDP per capita or by NNI per
capita, displays sustained growth in the UK since sometime around 1800 or so. Clark’s data shows
that there was not sustained growth in England over the period 1300-1700. Thus, for the UK or
England there is support for Kaldor’s first fact starting sometime around 1800 or slightly earlier.12
The time period starting after roughly 1800 is sometimes referred to as the period of modern
economic growth. Before this time period, average real yearly growth rates over long time periods
(e.g. centuries) for the most advanced economies are believed to be approximately zero. This is
consistent with the findings of Clark for England. Angus Maddison calculates that the average
yearly growth rate of real GDP per man hour in the UK from 1700-1780 was 0.3 percent.13 Thus,
during this period, the average yearly growth rate was quite small in comparison to growth rates for
many advanced economies over the last century.
11 See Gregory Clark, "The Macroeconomic Aggregates for England, 1209-1869" Research in Economic History,
2010. The data for the UK is taken from The Bank of England’s Three Centuries Macroeconomic Dataset Version 2.3 -
30 June 2016.
12 It is striking that Clark (2010) can construct a time series measuring net national income over this period. He uses
the factor incomes approach : NNI = Wages + LandRent + NetHouseRent + OtherCapitalIncome + IndirectTax. His
measure of Wages, in a given year, is based on the English population, measures of farm and non-farm wages per day,
and an assumption of average days of work in the year.
13 See Table 2.2 in Angus Maddison’s (1991) work “Dynamic Forces in Capitalist Development”, Oxford University
Press.
3.3 Solow Growth Model 33
[Figure: GDP and Labor Productivity. Vertical axis: log scale; horizontal axis: year (1200 to 2000).]
Ct + It = F(Kt , L) (3.10)
Kt+1 = Kt − δ Kt + It (3.11)
It = sF(Kt , L) where 0 < s < 1 (3.12)
The first equation says that from total output F(Kt , L) the amount Ct is consumed and the remainder,
It , is invested. The second equation says that capital in period t + 1 is equal to the capital from
period t, given by Kt , minus the amount of capital that is lost because of depreciation, given by
δ Kt , plus investment in new capital It . Parameter δ is between zero and one and is known as the
depreciation rate. One interpretation is that the depreciation rate is the rate at which buildings
crumble over time without repair. The third equation says that a fixed fraction of output s is invested
into new capital each period. The third equation is a behavioral rule, as it involves a decision,
whereas the first two assumptions are not about decisions.
To analyze the Solow model, combine equations 3.11 and 3.12 to obtain capital in period t + 1
in terms of capital and labor in period t.
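[Figure 3.3: the basic Solow model. Horizontal axis: k_t; vertical axis: k_{t+1}. Curves: the identity line k_{t+1} = k_t and the law of motion k_{t+1} = (1 − δ)k_t + sF(k_t, 1), which intersect at k∗.]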
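\[ \frac{K_{t+1}}{L} = \frac{1}{L}\, sF(K_t, L) + (1-\delta)\frac{K_t}{L} \tag{3.13} \]
\[ \frac{K_{t+1}}{L} = sF\left(\frac{K_t}{L}, \frac{L}{L}\right) + (1-\delta)\frac{K_t}{L} \tag{3.14} \]
\[ k_{t+1} = sF(k_t, 1) + (1-\delta)k_t \tag{3.15} \]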
This last equation is the key law of motion that describes the movement of capital per worker
over time in the Solow model. The equation says that capital per worker next period equals
investment per worker this period plus existing capital per worker after depreciation. Figure 3.3
describes the basic Solow model. The graph has capital per person this period, kt , on the horizontal
axis and capital per person next period on the vertical axis, kt+1 . We plot two functions in this
graph. The first is the identity function kt+1 = kt . Points along this line represent situations in
which capital per person does not change over time. The second function we plot is equation 3.15,
which describes how capital per worker evolves from one period to the next in the Solow model.
The intersection of the two curves is a level of capital per worker k∗ that does not
change over time in the Solow model. We will refer to this situation as a steady state of the model.
The assumption that the marginal product of capital diminishes as capital increases is
the key reason why the law of motion for capital bends over in Figure 3.3. This implies that
there is at most one steady state value for k∗ other than zero. There will be one such value when the
marginal product goes to zero as capital gets sufficiently large.
In order to interpret the steady state, substitute the steady state level of capital into the law of motion
for capital per worker to obtain 0 = −δ k∗ + sF(k∗ , 1). Rearranging, we obtain δ k∗ = sF(k∗ , 1). This
equation says that, in the steady state of the simple Solow model, investment equals depreciation.
That is, the amount invested every period is just enough to compensate for the deterioration of the
stock of capital. Therefore, capital per worker does not change over time.
An interesting question is whether the economy will move toward the steady state level of
capital over time. For example, suppose the stock of capital is kt = k∗/2. How will the stock of capital
evolve over time? This question asks about the nature of the dynamics of the basic Solow model. It
turns out that, using Figure 3.3, one can conclude that the economy will move gradually toward k∗
from any starting point kt > 0 as t increases. The following theorem explains this.
Theorem 3.3.1 — Dynamics of the simple Solow Model. In the simple Solow model, if
capital per worker is positive, then capital per worker moves gradually towards the steady state
k∗ over time.
Proof. Consider any point kt > 0 on the horizontal axis. If the point is to the left of k∗ then, at
that point, the law of motion of the Solow model lies above the identity line, so one knows that
kt+1 > kt . One can also see that, because the law of motion is increasing, kt+1 < k∗ . This means
that the capital stock per person will be larger next period, but not larger than k∗ . Thus, if capital per
worker is to the left of k∗ , capital per worker will move part of the way toward k∗ by next period.
Now consider any point kt > k∗ . At any such point, the law of motion lies below the identity
line. This implies that, according to the law of motion of the basic Solow model, kt+1 < kt , so that
capital per worker will be lower next period. Again, since the law of motion is increasing, kt+1 > k∗ .
Thus, if capital per worker is to the right of k∗ , capital per worker will move part of the way toward
k∗ by next period. This completes the proof.
An alternative way to answer this question is to construct a numerical example and simulate the
time path of capital per person. We do so next.
Example 3.1 — Basic Solow Model: A Numerical Example. To see how the basic Solow
model works at a mechanical level it is useful to consider a numerical example. Specifically, we
will describe a particular production function and particular values of all parameters (e.g. the
depreciation rate, the savings rate and all parameters describing the production function). Once
this is done, we will use the basic equation of the Solow model to compute how values of the
capital-labor ratio and other variables change over time. All calculations can be done with a
standard spreadsheet, with any programming language or by hand calculator.
The key equation of the Solow model which describes dynamics is given in the first equation
below. This equation is written in terms of the capital-labor ratio. To use this equation we have to
express the production function in terms of ratios to the labor input. This can be done for the
Cobb-Douglas function Y = F(K, L) = AK^β L^{1−β} simply by dividing both sides by labor L to get
y = Ak^β. The second equation below then follows from the first by substituting y = F(kt , 1) = Akt^β
into the first equation. The lowercase symbol y = Y /L is output per worker or the output-labor
ratio.
\[ k_{t+1} = sF(k_t, 1) + (1-\delta)k_t \tag{3.16} \]
\[ k_{t+1} = sAk_t^{\beta} + (1-\delta)k_t \tag{3.17} \]
Table 3.2 calculates time profiles for several periods for a number of variables in the Solow
model. Table 3.2 is based on the assumption that the initial capital-labor ratio equals 1 (i.e. k0 = 1.0)
and uses the parameters stated in Table 3.1. First, the time profile for the capital-labor ratio is
calculated using equation 3.17. All the other profiles are calculated using the profile for the capital-
labor ratio because output, investment and consumption are simple functions of the capital-labor
ratio. The final steady state quantities are also indicated at the bottom of the table. This numerical
example illustrates that the capital-labor ratio converges to the steady-state capital-labor ratio over
time.
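The same kind of calculation can be sketched in a few lines of Python. The parameter values below (A = 1, β = 0.3, s = 0.2, δ = 0.1) are assumptions chosen for illustration and are not necessarily those of Table 3.1; only the initial condition k0 = 1.0 is taken from the text. The loop iterates the law of motion for the capital-labor ratio and records output, investment, and consumption per worker each period.

```python
A, beta, s, delta = 1.0, 0.3, 0.2, 0.1   # assumed values, not those of Table 3.1
k = 1.0                                   # initial capital-labor ratio k0 from the text

path = []
for t in range(200):
    y = A * k**beta            # output per worker
    i = s * y                  # investment per worker
    c = y - i                  # consumption per worker
    path.append((t, k, y, i, c))
    k = i + (1 - delta) * k    # law of motion for the capital-labor ratio

k_star = (s * A / delta) ** (1 / (1 - beta))   # solves delta*k = s*A*k^beta

for row in path[:5]:
    print("t=%d k=%.4f y=%.4f i=%.4f c=%.4f" % row)
print("k after 200 periods: %.4f, steady state k*: %.4f" % (path[-1][1], k_star))
```

With these assumed values the capital-labor ratio rises each period and approaches the steady state from below, which is the convergence pattern the example describes.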
Example 3.2 — Basic Solow Model: An Empirical Application. In the basic version of the
Solow model, we have so far assumed that labor input Lt and technology At do not change over
time. Now we will apply the basic model when the population exogenously changes over time.
Figure 3.4 plots data on the population in England from the work of Gregory Clark.14 The
English population was roughly 5 million around 1300 but decreased in several steps to around 2.5
million by 1400. According to Clark, the population was roughly 4.5 million in 1348 but was 3.5
million in 1350. If you Google “The Black Death in England” you will find a Wikipedia page that
states: “The Black Death was a pneumonic plague pandemic, which reached England in June 1348,
and died down by December 1349. It was the first and most severe manifestation of the Second
Pandemic, caused by Yersinia pestis bacteria.” We will view the fall in the population of England
as due to an exogenous shock related to the arrival of the Black Death.
Clark also calculates measures of the real wage for farm workers and for non-farm workers. His
measures of wage rates can be calculated more frequently than his measure of the population of
England. He finds that the real wage for both labor types spikes upwards when the population falls.
The real wage for farm work increases substantially when the population
falls and remains above the pre-1348 farm wage level. We take two main facts away from Figure
3.4. First, the English population falls after 1348. Second, average real wage rates increase in
England after 1348 compared to their previous level, particularly so for farm wages.
The key thing that we want to explain is why wage rates moved in the opposite direction
from the population. We will use the Solow model and the assumption that the
plague is an exogenous shock that kills people (L) but does not destroy physical capital (K) such as
buildings, roads, ships and farmlands. We will view it as a one-time shock for theoretical convenience that
14 See Gregory Clark, "The Macroeconomic Aggregates for England, 1209-1869" Research in Economic History,
2010.
[Figure: England: Black Death. Horizontal axis: time (1340 to 1370).]
permanently lowers the population of workers in the model and that this shock happens in one
model period.
The theoretical model implies that the capital-labor ratio k = K/L immediately increases due
to the shock. If k was initially at a steady state level k∗ before the shock (see Figure 3.3), then k
immediately moves to the right after the shock: the plague killed people but did not destroy capital.
After the shock, the model implies that the capital-labor ratio slowly returns to the previous steady
state level k∗ absent future shocks. Of course, this implies that the new total capital stock in steady
state is lower than before the plague.
What does the model imply for the wage rate? We assume that factors of production (labor and
capital) get paid their marginal products. Thus, the real wage rate w in the model is determined
in theory by the level of the technology and by the capital-labor ratio k. For a Cobb-Douglas
production function - see below - the implication is that the wage rate is higher immediately
after the shock than before the shock. This is simply because a relatively high capital-labor ratio
produces, technology held constant, a relatively high marginal product of labor.
\[ w = F_L(K, L) = (1-\beta) A K^{\beta} L^{-\beta} = (1-\beta) A \left(\frac{K}{L}\right)^{\beta} = (1-\beta) A k^{\beta} \]
The model also implies that the wage rate will slowly decrease over time (absent future shocks)
and will return to the previous steady state level determined by the production function and k∗ . In
summary, the theory predicts that wages increase immediately after the shock and then gradually
decrease back to the steady state value absent other shocks. The model qualitatively predicts the
increase in the wage rate for farm workers that Clark documents.
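This qualitative prediction can be traced out with a short simulation. The Python sketch below assumes Cobb-Douglas production with hypothetical parameter values (A = 1, β = 0.3, s = 0.2, δ = 0.1) and models the shock as a one-time fall in labor from 4.5 to 3.5, echoing the population figures quoted above, with the capital stock unchanged. The function name `wage` is illustrative.

```python
A, beta, s, delta = 1.0, 0.3, 0.2, 0.1          # assumed parameter values

def wage(k):
    """Real wage w = (1 - beta) * A * k^beta from the Cobb-Douglas case."""
    return (1 - beta) * A * k**beta

k_star = (s * A / delta) ** (1 / (1 - beta))    # pre-shock steady state
w_before = wage(k_star)

# One-time shock: labor falls from 4.5 to 3.5 (millions) while K is unchanged,
# so k = K/L jumps up by the factor 4.5/3.5.
k = k_star * 4.5 / 3.5
wages = []
for t in range(300):
    wages.append(wage(k))
    k = s * A * k**beta + (1 - delta) * k       # basic Solow law of motion

print(w_before, wages[0], wages[-1])
```

In this sketch the wage jumps on impact and then declines period by period toward its pre-shock level, matching the pattern described in the text.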
Ct + It = F(Kt , Lt At ) (3.18)
Kt+1 = It + (1 − δ )Kt (3.19)
It = sF(Kt , Lt At ) (3.20)
Lt+1 = Lt (1 + n) (3.21)
At+1 = At (1 + g) (3.22)
Compared to the equations of the simple Solow growth model, these equations now have
effective labor in the production function and also have two laws of motion for labor and technology.
The growth rates of technology and population are denoted with the symbols g and n, respectively.
In the previous section, it was possible to analyze the steady state and the dynamics of the
simple Solow model by expressing the law of motion in terms of capital per unit of labor. In this
section, our goal will be to find a law of motion for capital per unit of effective labor.
The first step we take is to find the growth rate of effective labor. To do this, we combine the
laws of motion of labor and technology which appear in equations 3.21 and 3.22 above:
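[Figure 3.5: the full Solow model. Horizontal axis: k_t; vertical axis: k_{t+1}. Curves: the identity line k_{t+1} = k_t and the law of motion k_{t+1} = [(1 − δ)k_t + sF(k_t, 1)] / [(1 + g)(1 + n)], which intersect at k∗.]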
Definition 3.3.2 Law of Motion of the Solow Growth Model: The following equation describes
the dynamics of capital per unit of effective labor in the Solow Growth model as a function of
the saving rate s, the depreciation rate δ , population growth rate n, technological growth rate g
and the shape of the production function F:
\[ k_{t+1} = \frac{sF(k_t, 1) + (1-\delta)k_t}{(1+n)(1+g)} \]
Figure 3.5 describes the full Solow model. Note that Figure 3.3 and Figure 3.5 look very similar,
but they differ in two important ways. First, the kt in Figure 3.3 was defined as Kt /L whereas the kt in
Figure 3.5 is defined as Kt /(Lt At ). Second, the law of motion is now divided
by (1 + g)(1 + n).
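The law of motion in Definition 3.3.2 can be obtained from equations 3.18 to 3.22 in a few steps. The following is a sketch of that derivation, writing k_t = K_t/(A_t L_t) and using constant returns to scale in the last step:

```latex
\begin{align*}
A_{t+1} L_{t+1} &= (1+g)(1+n)\, A_t L_t \\
K_{t+1} &= sF(K_t, A_t L_t) + (1-\delta) K_t \\
\frac{K_{t+1}}{A_{t+1} L_{t+1}}
  &= \frac{sF(K_t, A_t L_t) + (1-\delta) K_t}{(1+g)(1+n)\, A_t L_t} \\
k_{t+1} &= \frac{sF(k_t, 1) + (1-\delta) k_t}{(1+n)(1+g)}
\end{align*}
```

The first line combines the laws of motion for labor and technology, the second combines the capital accumulation and investment equations, and the third divides one by the other.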
The intersection of the identity function kt+1 = kt and the law of motion determines the steady
state level of capital per unit of effective labor of the economy. Therefore, at k∗ , capital per effective
unit of labor in the economy becomes constant over time.
As in the simple Solow model, the dynamics of the model say that, if capital per unit of effective
labor is positive, it will move gradually towards k∗ over time. Thus, we can formulate a theorem.
Theorem 3.3.2 — Dynamics of the Solow Model. In the Solow model, if capital per unit of
effective labor is positive, then capital per effective unit of labor moves gradually towards the
steady state k∗ over time.
Let’s now analyze the steady state of the economy. In steady state, capital per effective unit of
labor is equal to k∗ , so that Kt /(At Lt ) = k∗ . Therefore
Kt = k∗ At Lt
This equation shows that, in steady state, the capital stock is proportional to effective labor. For
this reason, the stock of capital Kt grows at rate (1 + g)(1 + n) − 1 in the steady state. This is the
growth rate of effective labor, which is exogenous to the model. Output is given by the production
function Yt = F(Kt , Lt At ). If capital is growing according to the steady state equation then we have
Yt = F(k∗ At Lt , At Lt ). By constant returns to scale, we have that
Yt = At Lt F(k∗ , 1).
So output also grows at rate (1 + g)(1 + n) − 1. It follows immediately that investment It = sYt
also grows at rate (1 + g)(1 + n) − 1, and consumption Ct = (1 − s)Yt also grows at this rate. This
follows because, by assumption, investment and consumption are fixed fractions of output dictated
by s and 1 − s, respectively. Thus, we have determined that all these aggregate variables grow at
constant rates in a steady state of the Solow model.
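These steady-state growth rates can be verified with a short simulation of equations 3.18 to 3.22. The Python sketch below assumes a Cobb-Douglas production function in capital and effective labor and hypothetical parameter values; the names `F`, `denom`, and `k_star` are illustrative. Capital starts at its steady-state level per unit of effective labor, so the economy is in steady state from the first period.

```python
A, L = 1.0, 1.0                                # initial technology and labor (assumed)
beta, s, delta, n, g = 0.3, 0.2, 0.1, 0.01, 0.02   # assumed parameter values

def F(K, N):
    """Assumed Cobb-Douglas production in capital K and effective labor N."""
    return K**beta * N**(1 - beta)

# Steady state of k(1+n)(1+g) = s*k^beta + (1-delta)*k under the assumed F.
denom = (1 + n) * (1 + g) - (1 - delta)
k_star = (s / denom) ** (1 / (1 - beta))
K = k_star * A * L

Y_path, K_path = [], []
for t in range(10):
    Y = F(K, A * L)
    Y_path.append(Y)
    K_path.append(K)
    K = s * Y + (1 - delta) * K                # equations 3.19 and 3.20
    L *= 1 + n                                 # equation 3.21
    A *= 1 + g                                 # equation 3.22

growth_Y = Y_path[-1] / Y_path[-2] - 1
growth_K = K_path[-1] / K_path[-2] - 1
print(growth_Y, growth_K, (1 + g) * (1 + n) - 1)
```

Because investment and consumption are fixed fractions of output, their growth rates in this sketch coincide with that of output.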
which is a constant in steady state. Since the depreciation rate is constant then so is the return
to capital r in steady state.
Note that we provided a quick and clean answer in 4 and 5 by specializing the production
function to Cobb-Douglas and using results derived previously in this chapter. It turns out that
the conclusions for properties 4 and 5 hold, in steady state, regardless of whether the production
function is Cobb-Douglas. We do not present the argument when the production function is not
Cobb-Douglas as it is somewhat technical.
The only remaining fact that the Solow model has yet to explain is Kaldor’s sixth fact. In the
Solow model, different countries can have different steady state levels of output per effective unit of
3.5 Cross Country Comparisons 41
labor and different growth rates if they have (i) different saving rates s, (ii) different population
growth rates n or (iii) different growth rates of technology g. In the next section we examine how
far measured differences in saving rates can go in explaining cross country differences in output,
assuming the same technology and the same (g, n) values for all countries.
The situation in which two countries differ in saving rates can be illustrated in Figure 3.3 by
shifting the law of motion up or down. Specifically, a higher savings rate implies a higher kt+1
at every level of kt , thus shifting the law of motion upwards. A similar analysis can be conducted
when the population growth rate is shifted: a different steady state level of capital per unit of
effective labor results. The figure is useful in ranking steady states.
Up to this point, we have focused mostly on developing the logic of how the Solow model works.
An interesting issue is whether the model broadly seems to make sense of the data. The previous
section argued that the steady state of the model is consistent with Kaldor’s growth facts 1-5. What
is not clear is the extent to which the model is consistent with Kaldor’s sixth fact: Y /L differs widely
across countries at a point in time. The US has a level of GDP per capita that is approximately 30
times that of some very poor countries.15
While a careful quantitative analysis of the degree to which the Solow model is in agreement
with such facts is beyond the scope of this book, it may be useful to lay out some facts and some
opinions about the state of the literature. First, cross-country data does show that countries with
high measured Y /L also typically have a high measured K/L. This is good news for a theory that
requires that Y /L = F(K/L, A) and that maintains as a provisional assumption that A is common
across countries. Second, countries with high measured Y /L typically have high measured
investment rates I/Y over long time periods. This also seems to be good news as within the Solow
model a high investment rate (i.e. a high s) in steady state is the means of attaining high K/L and
high Y /L, given the assumption that technology is common across countries.
Now one can approach this second fact from a different angle, to see if this is really good news
for the Solow model. One could ask first what are the savings or investment rates at the high and
low end of the distribution. One can find very low investment countries, like Egypt, Chad or Uganda
averaging s2 = .05 or less over several decades and very high investment countries, like China or
Singapore, averaging s1 = .30 for a few decades. One could then ask whether such differences lead
in theory to big steady state differences in Y /L or Y /LA, holding technology A, depreciation δ ,
population growth n and technological growth g constant across countries.
How much does steady state output per worker change as s changes, other things equal? To
answer this question, we first find k∗ , which is determined by the intersection of the identity line and
the law of motion in Figure 3.5. Consider the Cobb-Douglas production function y = F(k, 1) = k^β .
Plugging this function into the law of motion for capital per effective unit of labor in equation 3.26,
and imposing kt+1 = kt = k∗ , we obtain:
15 Facts relating to differences in GDP per capita across countries at a point in time and across time periods are presented
in Parente and Prescott’s paper entitled “Changes in the Wealth of Nations”, Federal Reserve Bank of Minneapolis
Quarterly Review, 1994. In that work differences in GDP per capita across countries at a point in time are measured
using a common set of world prices.
Note that this ratio will not depend on ζ , only on β and the saving rates. It would depend on ζ
if n or g were different across countries. If we take a stand on the value of the parameter β and set
β = .30, which is a ball-park number for capital’s share in the U.S., then the ratio is
y1∗/y2∗ = 6^{.3/.7} ≈ 2.15.
Recall that this is the ratio of output per effective unit of labor across countries in steady state
of the Solow model. If technology is equal across countries, then we interpret 2.15 as the ratio of
output per worker across countries that would hold in steady state when only the savings rates differ
among the exogenous inputs to the Solow model. This is a tiny ratio compared to the factor of 30
differences observed in cross-country data. Even if β = .5, then the ratio is 6. The upshot is that
steady-state differences in output implied by the Solow model and measured differences in saving
rates alone are quite small compared to output differences measured in data. Thus, we conclude
that something other than savings rates in physical capital must be very important in accounting for
observed cross-country differences.
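The arithmetic above can be reproduced with a minimal sketch. It assumes that the steady-state output ratio between two countries that differ only in their saving rates takes the form (s1/s2)^(β/(1−β)), which is the form consistent with the numbers quoted in the text. The function name is illustrative and not part of the original notes.

```python
def steady_state_output_ratio(s1, s2, beta):
    """Ratio y1*/y2* when only saving rates differ.

    Assumes Cobb-Douglas output per effective worker, y = k**beta, with
    technology, depreciation, population growth and technology growth
    common to both countries.
    """
    return (s1 / s2) ** (beta / (1.0 - beta))


# Saving rates quoted in the text: s1 = .30 (high) and s2 = .05 (low).
ratio_30 = steady_state_output_ratio(0.30, 0.05, 0.30)  # capital share .30
ratio_50 = steady_state_output_ratio(0.30, 0.05, 0.50)  # capital share .50

print(f"beta = .30: ratio = {ratio_30:.3f}")
print(f"beta = .50: ratio = {ratio_50:.3f}")
```

Even a six-fold difference in saving rates moves the implied output ratio only modestly, because the exponent β/(1−β) is well below one when capital's share is around .30.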
One maintained assumption in applying the Solow model to interpret cross-country data, for
example the analysis of savings rate differences above, is that the technology level is the same across
countries. This assumption seems likely to be incorrect. We now indicate how in principle one
might go about trying to indirectly infer technological differences across countries. The framework
described below allows country i’s GDP per worker denoted Yi to be determined by the technology
Ai and the per worker input of capital Ki and the per worker quality adjusted labor Li in country i.
It is typical in this literature to use a Cobb-Douglas production function and an empirical estimate
of capital’s share β . The basic idea is then to measure (Yi , Ki , Li ) in a cross section of countries at a
point in time and then to back out technology Ai for each country i.
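The idea of backing out technology can be sketched as follows. The sketch assumes a Cobb-Douglas form Y_i = A_i K_i^β L_i^(1−β) in per worker terms. The country labels and all numbers are made up purely for illustration and are not measurements from any data set.

```python
def implied_technology(y, k, l, beta=0.30):
    """Back out A from Y = A * K**beta * L**(1 - beta).

    y, k and l are output, capital and quality-adjusted labor, all
    measured per worker. The functional form is an assumption.
    """
    return y / (k ** beta * l ** (1.0 - beta))


# Hypothetical per worker data (Y_i, K_i, L_i); NOT real measurements.
countries = {
    "rich": (30.0, 90.0, 2.0),
    "poor": (1.0, 2.0, 1.0),
}

levels = {name: implied_technology(*data) for name, data in countries.items()}

for name, a in levels.items():
    print(f"{name}: implied technology level A = {a:.3f}")
```

Whatever part of the output gap is not matched by measured capital and labor quality shows up in the inferred A_i, so errors in measuring inputs translate directly into errors in inferred technology.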
The literature which carries out this type of calculation is surveyed by Caselli (2005).16 A key
issue is then to have a measure of worker quality. In practice economists use data on the distribution
16 See Francesco Caselli (2005), Accounting for Cross-Country Income Differences, Handbook of Economic Growth,
Chapter 9.
of the workforce by experience (years worked) and by years of schooling. The idea is that in
cross-section data earnings increase with both experience and schooling and thus workers with
high experience and schooling are more productive and, hence, are of higher quality. To the degree
that rich countries have a distribution of workers with higher experience and higher schooling than
poor countries, these differences are proximate reasons for rich countries
having larger quality adjusted labor input Li per worker and, thus, higher output per worker.
A typical finding from this literature (see Caselli (2005)) is that rich countries (i.e. countries
with high Y ) have relatively high capital per worker K and labor quality per worker L. Thus,
variation in measured factor inputs (K, L) accounts for some of the output per worker differences
across countries. However, these measured differences do not go very far to explain all of the output
per worker Y variation across countries. Rich countries are inferred to have a higher technology
level A than poor countries based on equation 3.31 and cross-country data. Technology differences
turn out to be a quantitatively very important source of GDP differences.
Some recent work by Lagakos, Moll, Porzio and Qian (2012) argues that better measurement
of labor quality differences across countries substantially reduces the importance of technology
differences.17 They argue that differences in capital and labor quality explain approximately
two-thirds of the measured ratio of GDP per capita of the country at the 90th percentile of the
distribution compared to GDP per capita of the country at the 10th percentile. If this result proves
to be widely supported in the data, then the key question in the literature is what accounts for
such measured differences in labor quality across countries. Of course, the Solow growth model is
silent on the sources of these differences as it is not a theory of worker quality differences. The
dominant body of work that provides theory for quality differences is the literature on human capital
accumulation.18
Growth accounting is a tool for dividing up output growth into distinct sources. This tool can
be used to answer two types of questions. The first type asks what portion of observed output
growth in a country (or even a firm) over some period of time can be accounted for by changes in
technology versus the portion that can be accounted for by changes in factor inputs. The second
type of question asks what would be the effect on output growth of a change in the technology or a
change in some specific factor input, other things equal.
In questions of the first type, growth accounting tells one where growth comes from. However,
it does not tell one why the economy functions in this way. Here, the analogy with financial
accounting is apt. An accountant may be able to tell you where the income of a firm or government
comes from based on the data, but an accountant may not have any theory explaining why it is
the case that income comes from these distinct sources. To answer the latter question one needs a
theory and not merely an accounting framework.
We will now lay out the theory behind growth accounting. Solow assumed that there is an aggregate
production function Yt = At F(Kt , Lt ). Thus, aggregate output Yt is produced when the technology
level equals At and the factor inputs of capital and labor are Kt and Lt , respectively.
17 Lagakos, Moll, Porzio and Qian (2012), Experience Matters: Human Capital and Development Accounting.
18 Gary Becker received the Nobel Prize in 1992 in part for his work on human capital.
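\begin{align}
Y_t &= A_t F(K_t, L_t) \tag{3.32} \\
\dot{Y}_t &= \dot{A}_t F(K_t, L_t) + A_t F_K(K_t, L_t)\dot{K}_t + A_t F_L(K_t, L_t)\dot{L}_t \tag{3.33} \\
\frac{\dot{Y}_t}{Y_t} &= \frac{\dot{A}_t F(K_t, L_t)}{Y_t} + \frac{A_t F_K(K_t, L_t)\dot{K}_t}{Y_t} + \frac{A_t F_L(K_t, L_t)\dot{L}_t}{Y_t} \tag{3.34} \\
\frac{\dot{Y}_t}{Y_t} &= \frac{\dot{A}_t}{A_t} + \left(\frac{A_t F_K(K_t, L_t)K_t}{Y_t}\right)\frac{\dot{K}_t}{K_t} + \left(\frac{A_t F_L(K_t, L_t)L_t}{Y_t}\right)\frac{\dot{L}_t}{L_t} \tag{3.35} \\
\frac{\dot{Y}_t}{Y_t} &= \frac{\dot{A}_t}{A_t} + \beta\frac{\dot{K}_t}{K_t} + (1 - \beta)\frac{\dot{L}_t}{L_t} \tag{3.36}
\end{align}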
Based on this assumption we now derive the Solow growth accounting formula. Equation
3.33 differentiates the production function with respect to time and denotes a time derivative of
a variable with a dot above the variable. Thus, $\dot{Y}_t$ denotes the time derivative of output which
would be written in familiar calculus notation as follows: $\dot{Y}_t = dY_t/dt$. Equation 3.33 says that the time
derivative of output Ẏt equals the effect of technological change plus the effect on output from the
change in capital plus the effect on output from the change in labor. Put slightly differently, output
changes only because technology changes or because factor inputs change over time.
It just remains to reorganize this expression in a useful way. Equation 3.34 divides by output so
that the left hand side of equation 3.34 is the output growth rate. Equation 3.35 reorganizes equation
3.34 so that the quantities in parentheses turn out to have natural interpretations in terms of data.
The quantity $A_t F_K(K_t, L_t)K_t / Y_t$ is interpreted as capital's share of output and is relabeled β in equation
3.36 whereas the quantity $A_t F_L(K_t, L_t)L_t / Y_t$ is interpreted as labor's share of output and is relabeled
(1 − β) in equation 3.36. Notice that the numerator term in each expression is the marginal product
of capital or labor times the amount of capital or labor. In competitive theory this amount equals
the total payment to capital and labor respectively.
Equation 3.36 is the Solow growth accounting equation. It says that output growth equals
technology growth plus the effect on output coming from the growth of capital and labor, respectively.
Capital growth is weighted by capital's share of output $\beta = A_t F_K(K_t, L_t)K_t / Y_t$ and labor growth is
weighted by labor's share of output $1 - \beta = A_t F_L(K_t, L_t)L_t / Y_t$. In competitive theory, capital's and labor's
shares sum to one with constant returns.
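Definition 3.6.1 The Solow Growth Accounting Equation is stated below in two different ways:
$$\frac{\dot{Y}_t}{Y_t} = \frac{\dot{A}_t}{A_t} + \beta\frac{\dot{K}_t}{K_t} + (1 - \beta)\frac{\dot{L}_t}{L_t} \quad \text{and} \quad \frac{\Delta Y_t}{Y_t} = \frac{\Delta A_t}{A_t} + \beta\frac{\Delta K_t}{K_t} + (1 - \beta)\frac{\Delta L_t}{L_t}$$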
When working with data it is convenient to approximate the instantaneous growth rates in the
theory with growth rates calculated using measured variables in successive time periods. Thus, for
example, the instantaneous growth rate of output Ẏt /Yt is approximated as ∆Yt /Yt = (Yt+1 −Yt )/Yt .
It is conventional to use the capital delta symbol ∆ to indicate a change in a variable. We follow the
same convention for factor inputs by replacing instantaneous growth rates with those calculated
using data in neighboring time periods.
Solow measured the technology growth rate indirectly by backing it out of the formula in each
year. Thus, the technology growth rate was calculated as a residual (i.e. whatever was needed to
make both sides of the equation hold with equality period by period). Thus, if the theory is correct,
then the measured technological growth rate will equal the true growth rate plus a term capturing
measurement errors in the measured variables.
The main findings of Solow’s analysis are listed below and follow directly from the growth
accounting equation and US data from 1909 to 1949.
1. Output per unit of labor input grows by about 100 percent in 1909-1949.
2. The capital-labor ratio grows by about 30 percent in 1909-1949. This is sometimes called
“capital deepening”.
3. Technology grows by about 80 percent over the period. Thus, about 80 percent of the
growth in output per unit of labor input over the period is accounted for by growth in the
technology and the remaining 20 percent by increases in the capital-labor ratio. The overall growth
in the technology over the period was computed as follows using the measured growth rates:
$A_{t+1} = A_t (1 + \Delta A_t / A_t)$, setting $A_{1909} = 1$.
4. The measure of the technology level falls in a number of recession and depression years and
tends to increase in expansions over the time period 1909-49. Thus, measured technology
growth rates are “procyclical”.
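The residual calculation and the cumulation of the technology level described above can be sketched as follows. The growth rates used here are made-up numbers chosen only to illustrate the mechanics; they are not Solow's data, and the function names are illustrative.

```python
def solow_residual(g_y, g_k, g_l, beta):
    """Technology growth backed out of the discrete accounting equation:
    dA/A = dY/Y - beta * dK/K - (1 - beta) * dL/L.
    """
    return g_y - beta * g_k - (1.0 - beta) * g_l


def technology_path(growth_rates, a0=1.0):
    """Cumulate A_{t+1} = A_t * (1 + dA_t/A_t), starting from A_0 = a0."""
    levels = [a0]
    for g in growth_rates:
        levels.append(levels[-1] * (1.0 + g))
    return levels


# Hypothetical annual growth rates (an expansion, a recession, an expansion).
output_growth = [0.04, -0.01, 0.05]
capital_growth = [0.03, 0.02, 0.03]
labor_growth = [0.02, -0.02, 0.03]
beta = 0.30  # assumed constant capital share

residuals = [
    solow_residual(gy, gk, gl, beta)
    for gy, gk, gl in zip(output_growth, capital_growth, labor_growth)
]
levels = technology_path(residuals)

for year, (g, a) in enumerate(zip(residuals, levels[1:]), start=1):
    print(f"year {year}: technology growth = {g:+.3f}, level = {a:.4f}")
```

In this made-up example the residual turns negative in the recession year, which is the kind of pattern that finding 4 labels procyclical.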
We will apply the growth accounting equation to US data from the Bureau of Labor Statistics
(BLS) over the period 1949-2016. The measures of real output and inputs are constructed by the
BLS for the private, non-farm business sector. The aggregate measure of labor input Lt is based on
hours of work in the private, non-farm business sector weighted by relative compensation.19 The
measure of capital input Kt is a combination of the various types of capital (e.g. land, structures,
equipment, ...). The measures of capital's share βt and labor's share (1 − βt) of income used in the calculation
vary by year based on labor and capital costs.
Figure 3.6 plots annual growth rates in BLS data. Output grows at an average rate of 3.4 percent
over the period whereas labor and capital input grow at average rates of 1.5 and 3.9 percent per
year. Thus, the capital-labor ratio grows over this time period. The technology growth rate $\Delta A_t / A_t$ is
backed out of the growth accounting equation each year based on the measured growth rates of
output and inputs as indicated below. The technology growth rate averages 1.1 percent per year
over the period.
The measure of the year-by-year growth rate of technology implies that the technology level more than
doubles over the period.
While there are perhaps many questions that one could ask about the results in Figure 3.6, we
focus on two questions.
19 Thus, the hours of highly paid labor (e.g. doctors) are assigned a greater weight than the hours of lower-paid labor.
[Figure 3.6. Upper panel: Growth Rates: Output, Labor and Technology. Lower panel: Technology Level. Horizontal axis in both panels: year, 1940 to 2020.]
particular pattern of measurement errors. For example, imagine that the measured capital growth
rate is overestimated in recessions and underestimated in expansions. This would occur if the capital
utilization was particularly high in booms but low in recessions but was not reflected in measured
capital levels. Such errors would then be reflected by overestimating technology growth rates in
booms and underestimating technology growth rates in recessions simply because we subtract the
measured growth rate of the capital input in calculating technology growth rates. If plausible, then
some of the procyclicality of measured technology growth could be due to a particular cyclical
pattern of measurement errors.
Explanation 2: (Technology Adoption and Learning)
Suppose that new technologies come along and that old technologies are never forgotten. When
a firm adopts a new technology it could plausibly be the case that not all of the expertise the firm had
with the old technology transfers to the new technology and that after the firm switches technology
there is a period of learning about the new technology whereby expertise is gradually accumulated
in this technology. An implication is that output and measured firm productivity could fall after a
switch if the loss in expertise is sufficiently large. Moreover, in all periods after the switch measured
firm technology should increase due to the effects of learning in the new technology. Effectively,
the measured worker hours are of higher quality over time due to learning.
At a more aggregate level, such as at the industry or economy-wide level, the measured aggregate
technology could fall when there is some synchronization in the timing of technology adoption.
A version of this explanation has been advanced to explain the slower aggregate technology growth
rates inferred from US data in the decades immediately after 1974.20 The hypothesis was that
changes in computer technology in the 1980s and 1990s were the new technology being adopted.
Explanation 3: (Reallocation of Factor Inputs)
At a point in time firms within an industry differ both in age and in productivity. A typical
finding is that more productive firms of a given age have a higher rate of survival. There are
then at least two possible sources of aggregate productivity growth within an industry. First, the
productivity of newly entering firms improves over time and the average productivity among the
surviving firms tends to increase over time. Second, more productive firms of a given age are more
likely to survive and on balance more capital and labor is allocated over time to these surviving,
high-productivity firms.
This is a much richer view of the process governing aggregate productivity than the simplest
microeconomic theory (of identical firms with constant returns technologies) that is one theoretical
foundation for the use of an aggregate production function. Given this richer view, anything that
changes the process of either the survival of firms or the reallocation of capital and labor inputs to
more productive firms will impact the output produced from given aggregate quantities of labor and
capital. Thus, inferred aggregate productivity will vary for these reasons.
Changes in government policy (governing the firing of workers, governing competition, governing
protection against foreign producers and so on) may be important sources of productivity
variation over long time periods. Changes in the ability of firms to borrow may be important for
productivity variation over shorter time periods. For example, a tightening of firms' borrowing
limits or an increase in borrowing costs in recessions could slow down the process of allocating
more capital and labor to the most productive firms, leading to a productivity growth slowdown.21
20 See Greenwood and Yorukoglu (1997), "1974," Carnegie-Rochester Conference Series on Public Policy, Elsevier,
vol. 46(1), pages 49-95.
21 See Khan and Thomas (2011), Credit Shocks and Aggregate Fluctuations in an Economy with Production Heterogeneity,
NBER Working Paper 17311, for work highlighting implications of a change in firms' ability to borrow. See
Foster, Haltiwanger and Syverson (2008), Reallocation, Firm Turnover and Efficiency: Selection on Productivity or
Profitability?, American Economic Review, 98, 394-425 for empirical work that relates aggregate productivity changes
to firm productivity changes and firm entry and exit.
Question 2: Is there a deeper sense in which, according to some theory, all output growth can
be attributed to technology growth even though the accounting framework and US 1909-1949 data
say clearly that there is an 80/20 split?
Answer to Question 2:
The Solow growth model is an example of a theory that predicts that there is no long-run
growth in output per unit of labor input or in the capital-labor ratio unless there is technological
progress. This model is consistent with the main findings of applying Solow growth accounting to
US data. Specifically, the growth model predicts that when technology growth is positive (i)
the capital-labor ratio grows and (ii) the output-labor ratio grows. Thus, growth accounting
performed on data generated from the Solow growth model would calculate that part of output
growth is due to technology and another part is due to increases in the capital-labor ratio. However,
it is key to point out that the only reason why the capital-labor ratio grows in this theory is because
technology grows! Thus, technology growth induces growth in the capital-labor ratio within this
theory.
The table shows that the growth in GDP ranges from a low of 7.3 percent for Hong Kong to
a high of 10.4 percent for South Korea. The table also shows that capital growth exceeds output
growth in each country over this time period. In each country, except Hong Kong, this occurs
as the investment-GDP ratio increases over the time period. The labor growth rate is greater in
each country than the growth rate of what Young terms “raw labor”, which is a growth rate of
unweighted labor hours. The constructed series for labor effectively weights the hours of more
skilled labor more highly. Since the education levels of the workforce increase markedly from the
beginning to the end of the sample period in each country, measured labor grows more rapidly than
raw labor. Another element behind the high rates of growth of labor is that labor force participation
is increasing over the period in each country.
22 The Tyranny of Numbers: Confronting the Statistical Realities of the East Asian Growth Experience, NBER working
The findings are that average technology growth, stated in the Table as TFP (Total Factor
Productivity) growth, is 2.3 percent in Hong Kong, 1.6 percent in South Korea, −0.3 percent in
Singapore and 2.4 percent in Taiwan. Young notes in his paper that although rates of TFP growth
on the order of 2 percent are relatively high, they are not that dissimilar to TFP growth rates calculated
for some developed countries in the world over a similar time period.23 For example, the growth
rate of the technology, based on BLS data on private, non-farm business sector, from Figure 3.6
averaged 1.1 percent per year for the US over the period 1949-2016.
The bottom line of this work is that the data support the claim that the bulk of GDP growth over
the time period is due to tremendous growth rates of factor inputs and is not due to tremendously
large growth rates of TFP. This finding leads economists to say that it is not surprising that the
Asian Tigers grew at high rates given their high growth rates of factor inputs. This is just what
theory with a constant returns to scale production function would predict! What is not explained in
this growth accounting exercise is why it was the case that these were the countries that chose to
invest so heavily in human and physical capital.
The second question posed at the beginning of this section was what does economic theory
predict will happen to the growth rates of output per capita in the next several decades in these
countries. Since the growth in output was not due to unusually large growth rates of technology, it
seems reasonable to use Solow growth theory to answer this question. Solow growth theory predicts
that, savings rates held constant, the growth rate of output per unit of labor input will converge to
the growth rate of the technology. Taking the US to be a country which is approximately in steady
state, this particular theory predicts that future growth rates in these countries in the next several
decades will look much more like the per capita growth rate in the US economy which has averaged
about 2 percent growth per year over the last 100 years.
Question 1: In the context of growth theory, which allocations are clearly bad allocations?
productivity (TFP) growth and sometimes it is called technology growth. Whatever the label, within models based upon
an aggregate production function it captures the proportionate upward or downward shift of the production function
about the current inputs.
24 See Phelps (1961), The Golden Rule of Accumulation: A Fable for Growthmen, American Economic Review, 51,
638-43. Edmund Phelps received the Nobel Prize in 2006 partly for his work on the Golden Rule.
best steady state is then the steady state k that gives maximum consumption. Economists call this
steady state the Golden Rule steady state.
The Golden Rule steady state is easy to describe both with a graph and with simple mathematics.
First, consider the mathematics. The problem of choosing a steady state k to maximize consumption
is written in the first line below. The first term in the maximization problem is output and the
second term is steady state investment. Thus, the difference is consumption. The solution to this
problem is written in the second line below. The second line notes that the maximum should have
the property that there is no gain (in consumption) to having a little more or a little less capital.
Thus, the derivative or slope of the first line should be precisely zero at the Golden Rule capital-labor
ratio.
This situation is graphed in Figure 3.7. The Golden Rule steady state kGR occurs at the capital
level k where the distance between the production function and the steady state investment line
is greatest. Geometrically, this can be determined by shifting the steady state investment line
up vertically until the line is just tangent to the production function. The Figure highlights this
geometric description of the Golden Rule steady state. Note that the geometry amounts to the claim
that the slope of the production function equals the slope of the steady state investment line.
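A small numerical sketch of the Golden Rule calculation follows. It assumes a Cobb-Douglas production function y = k^β and takes steady state investment to be [(1 + g)(1 + n) − (1 − δ)]k, the investment expression used later in this section. All parameter values are hypothetical and the function names are illustrative.

```python
def breakeven_rate(g, n, delta):
    """Slope of the steady state investment line: (1+g)(1+n) - (1-delta)."""
    return (1.0 + g) * (1.0 + n) - (1.0 - delta)


def steady_state_consumption(k, beta, g, n, delta):
    """Output minus steady state investment at capital level k."""
    return k ** beta - breakeven_rate(g, n, delta) * k


def golden_rule_capital(beta, g, n, delta):
    """Solve beta * k**(beta - 1) = (1+g)(1+n) - (1-delta) for k."""
    return (beta / breakeven_rate(g, n, delta)) ** (1.0 / (1.0 - beta))


# Hypothetical parameter values, chosen only for illustration.
BETA, G, N, DELTA = 0.30, 0.02, 0.01, 0.05

k_gr = golden_rule_capital(BETA, G, N, DELTA)
c_gr = steady_state_consumption(k_gr, BETA, G, N, DELTA)

print(f"Golden Rule capital: {k_gr:.3f}, consumption: {c_gr:.3f}")
for scale in (0.5, 1.5, 2.0):
    c = steady_state_consumption(scale * k_gr, BETA, G, N, DELTA)
    print(f"  k = {scale:.1f} * k_GR gives consumption {c:.3f}")
```

Steady state consumption is lower on either side of k_GR, which matches the tangency description: at k_GR the slope of the production function equals the slope of the steady state investment line.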
We are now ready to answer Question 1. The answer is that any allocation where the sequence
of capital stock always remains strictly above the Golden Rule steady state capital stock is a bad
allocation. The reason why such an allocation is bad is that one can come up with a feasible
alternative allocation that allows for comparatively more aggregate consumption in all periods.
To be concrete, assume that the economy is at a steady state above the level kGR . Then there is
a “free lunch” that can be had simply by decreasing the capital stock to the Golden Rule level and
maintaining it there forever. Clearly, this is possible since consumption at the Golden Rule is larger
than at any capital level above the Golden Rule. In summary, any steady state above the Golden
Rule steady state is bad since, paradoxically, the economy suffers from having too much investment.
These equations are useful as they have simple interpretations in terms of observables. The
second equation can be interpreted as stating that the gross interest rate (i.e. 1 + r ≡ 1 + Fk (k, 1) − δ )
is less than the steady state gross growth rate of aggregate output (i.e. (1 + g)(1 + n)).25 Both of these
quantities can be measured. The third equation says that the aggregate payment to capital kFk (k, 1) is
less than aggregate investment k[(1 + g)(1 + n) − (1 − δ )]. Once again, each of these quantities can
be measured. The fourth equation says that the aggregate net payment to capital is less than aggregate
net investment.
These interpretations were related to data in a well-known paper by Abel, Mankiw, Summers
and Zeckhauser (1989).26 They first note that relating the gross interest rate to the gross growth
rate of output is problematic. The reason that this is problematic is that there are many interest
rates and returns that can be calculated from data in actual economies. For example, one could
choose the average real interest rate on US Treasury Bills or, alternatively, the average real return
on the US stock market. The average real returns on Treasury bills and Treasury bonds are about 1
and 2 percent, respectively, and the average real return on the US stock market is about 6 percent
25 Recall that in a steady state of the Solow model output grows at a gross rate of (1 + g)(1 + n), which is approximately
one plus the population growth rate plus the growth rate of the technology.
26 Abel et al. (1989), Dynamic Efficiency: Theory and Evidence, Review of Economic Studies, Volume 56, 1-20.
over long time periods.27 One of these returns is larger than the 3 percent average growth rate of
real output in the US over long time periods and the other two are smaller. Thus, using average
returns one could conclude either that the US economy is well above the Golden rule or well below,
depending on which asset one chooses to look at!
[Figure 3.8. US Data 1929-85. Vertical axis: Fraction of GNP. Horizontal axis: Year.]
The problem with the second equation is evidently that the model is too simple. Treasury bills
and stock differ enormously in risk characteristics and, as a result, have different average returns.
The theory abstracts from risk, has a single real interest rate and therefore provides no help in
deciding which asset return to use and how to use it. To respond to this issue one needs a theory
that incorporates risk. While this type of analysis is done in the literature it is too advanced for a
useful discussion at the level of this book.
Abel et al. (1989) argue that the third and fourth equations above can be related to data in a
manner which does not lead to ambiguity. Following the discussion above, they compute the gross
payment to capital and the gross investment in the US as a ratio to GNP. These are empirical proxies
for the underlying theoretical concepts in the third equation above. Some of the empirical results of
their paper for the payment to capital and investment as a ratio to GNP are contained in Figure 3.8.
They find that the gross payment to capital is always well above gross investment in the US.
Their measure of gross payment to capital varies from a low of about 23 percent in 1945 to a
high of 32 percent in 1929. By comparison, gross investment varies from a low of 1.9 percent
in the Great Depression to a high of 19 percent in 1950. Thus, investment is always below the
payment to capital. This pattern also holds for a number of European countries plus Japan. Based
27 See Jeremy Siegel (2002, Table 1.1 and 1.2) "Stocks for the Long Run" Third Edition, McGraw Hill.
on this evidence, Abel et al. (1989) conclude that the advanced economies all appear to be below
the Golden Rule. Thus, there appears to be no free lunch to be had from growth theory. Stated
differently, the advanced economies of the world may have many problems but one problem that
they do not suffer from is having accumulated too much physical capital.
We review some of the main points of consumer theory which are relevant to decision making over
the life cycle. Much of this material should be familiar from an introductory or intermediate level
microeconomics course. This chapter assumes that the student has a basic knowledge of the theory
of preferences, utility and consumer demand from previous course work.
First, the two-good problem from standard static consumer theory is presented. Second, the
two-period problem in dynamic consumer theory is presented. The main result is that static and
dynamic theory are the same after reinterpreting prices and income. Third, we show how to extend
the dynamic consumer theory framework to arbitrarily many periods. The multi-period framework
is helpful for interpreting data on yearly consumption expenditures and follows without too much
effort from the analysis of the two-good problem.
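[Figure 4.1: Budget line and indifference curve for the two-good problem. Horizontal axis: sunscreen (c1), with intercept W/p1. Vertical axis: lemonade (c2), with intercept W/p2. The budget line has slope −p1/p2 and the indifference curve has slope −(α/(1 − α))(c2/c1).]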
A solution (c1 , c2 ) to this problem satisfies two conditions: (i) the slope of the indifference
curve (which economists call the marginal rate of substitution (MRS)) at (c1 , c2 ) equals the slope
of the budget line and (ii) (c1 , c2 ) lies on the budget line so that the consumer spends all of his/her
income (denoted W rather than I in Figure 4.1). These two conditions are stated below.
$$MRS(c_1, c_2) = -\frac{p_1}{p_2} \tag{4.1}$$
$$p_1 c_1 + p_2 c_2 = I \tag{4.2}$$
If one were given a specific utility function U(c1 , c2 ), then one could calculate the marginal
rate of substitution function MRS(c1 , c2 ) implied by U and proceed to solve these two equations.
The result would be a system of two Marshallian demand equations stating the best choices of
each good as functions of prices (p1 , p2 ), income I and parameters describing the utility function.
Here we use the result that MRS(c1 , c2 ) = −U1 (c1 , c2 )/U2 (c1 , c2 ), where U1 and U2 denote partial
derivatives of the utility function with respect to goods 1 and 2 (see footnote 2). These derivatives describe the
marginal utilities of consuming an extra unit of these goods.
Consider the case where $U(c_1, c_2) = c_1^{\alpha} c_2^{1-\alpha}$ or, equivalently, where $U(c_1, c_2) = \alpha \log(c_1) + (1 - \alpha)\log(c_2)$.
For this utility function the marginal rate of substitution MRS(c1 , c2 ) and the
Marshallian demand functions describing best choices are as follows:
2 I will present it for those who are interested in where it comes from.
Step 1: Define an indifference curve with the equation U(c1 , c2 ) = constant.
Step 2: We want to find out how c2 changes as c1 changes along an indifference curve. So let one variable be a
function of the other: U(c1 , c2 (c1 )) = constant.
Step 3: Differentiate using the chain rule: U1 +U2 dc2 (c1 )/dc1 = 0.
Step 4: Rearrange to get the result: dc2 (c1 )/dc1 = −U1 /U2 .
Step 5: Thus, $MRS(c_1, c_2) = -U_1(c_1, c_2)/U_2(c_1, c_2)$.
Step 6: When $U(c_1, c_2) = \alpha \log(c_1) + (1 - \alpha)\log(c_2)$, then $MRS(c_1, c_2) = -\frac{\alpha c_2}{(1 - \alpha) c_1}$ because $U_1(c_1, c_2) = \alpha/c_1$ and
$U_2(c_1, c_2) = (1 - \alpha)/c_2$.
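\begin{align}
c_1 &= \alpha \frac{I}{p_1} \tag{4.3} \\
c_2 &= (1 - \alpha)\frac{I}{p_2} \tag{4.4} \\
MRS(c_1, c_2) &= -\frac{U_1(c_1, c_2)}{U_2(c_1, c_2)} = -\frac{\alpha \frac{1}{c_1}}{(1 - \alpha)\frac{1}{c_2}} = -\frac{\alpha}{1 - \alpha}\frac{c_2}{c_1} \tag{4.5}
\end{align}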
The interpretation is that the consumer always spends a constant fraction of income I on each
good regardless of prices. The preference parameters α and (1 − α) determine these fractions.
Another feature of the optimal consumption is that when income increases or decreases, prices held
constant, then consumption of each good moves exactly proportionally to income. This is a special
case of all goods being “normal” goods where the income elasticity is precisely equal to 1. Lastly,
the demand curve traced out by varying the price of each good, other things equal, is downward
sloping.
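The demand functions 4.3 and 4.4 can be checked numerically with a short sketch. The prices, income and preference parameter below are arbitrary illustrative values, and the function names are not from the text.

```python
def marshallian_demand(alpha, p1, p2, income):
    """Best choices for U = alpha*log(c1) + (1 - alpha)*log(c2)."""
    c1 = alpha * income / p1
    c2 = (1.0 - alpha) * income / p2
    return c1, c2


def mrs(alpha, c1, c2):
    """Marginal rate of substitution -U1/U2 for the same utility function."""
    return -(alpha / (1.0 - alpha)) * (c2 / c1)


# Arbitrary illustrative values.
alpha, p1, p2, income = 0.4, 2.0, 5.0, 100.0
c1, c2 = marshallian_demand(alpha, p1, p2, income)

print(f"c1 = {c1:.2f}, c2 = {c2:.2f}")
print(f"spending shares: {p1 * c1 / income:.2f} and {p2 * c2 / income:.2f}")
print(f"MRS = {mrs(alpha, c1, c2):.3f}, -p1/p2 = {-p1 / p2:.3f}")
```

The spending shares equal α and 1 − α whatever the prices are, and at the chosen bundle the MRS equals the slope of the budget line, so both conditions 4.1 and 4.2 hold.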
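[Figure 4.2: Budget line and indifference curve for the two-period problem. Horizontal axis: period 1 consumption, with c1, w1 and the intercept W marked. Vertical axis: period 2 consumption, with c2, w2 and the intercept W(1 + r) marked. The budget line has slope −(1 + r) and the indifference curve has slope −(α/(1 − α))(c2/c1).]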
The dynamic theory differs from static consumer theory in two main ways. The first is that
the interpretation of a good differs. Good 1 is now the consumption good in time period 1,
whereas good 2 is now the consumption good in period 2. Thus, there is only one good per time
period. This difference means that we will be assuming that consumers make optimal consumption
plans over a two-period lifetime rather than just at a point in time. We will assume that there is
no uncertainty about future labor income and that the consumer correctly forecasts future labor
income. Thus, we assume that any consumer is both forward thinking and an optimizer. While
real-world consumption-savings decisions are made when there is substantial risk related to future
labor earnings, we abstract from this risk.
3 In the early 1900s Irving Fisher considered the two-period problem that we analyze. In the 1950s Modigliani and
Brumberg analyzed many-period versions of the Fisher model. Modigliani later received the Nobel prize in 1985, in part,
for this work. In the 1950s Milton Friedman considered a version of this model where labor income is risky. Friedman
received the Nobel prize in 1976 for this work. The standard theory that present-day economists use to think about
consumption and savings behavior is a result of this line of research.
The second difference is that the budget constraint is written differently. The budget set is
defined by the two equations below. The terms $w_1$ and $w_2$ refer to wages (labor earnings) received
in periods 1 and 2, respectively. The term $a_2$ is asset holding carried from period 1 into period 2 and
$r$ is the real interest rate. As there is no risk in the model this is a risk-free real interest rate.
\[ c_1 + a_2 \le w_1 \]
\[ c_2 \le a_2 (1 + r) + w_2 \]
The typical graph of this dynamic utility maximization problem is presented in Figure 4.2. In
this graph the present value of labor income is denoted $W$. We use the convention that
$W = I = w_1 + w_2/(1 + r)$. In Figure 4.2, the consumer can consume more in period 1 than labor income in
period 1 (i.e. $c_1 > w_1$) if he/she wants to do so. The maximum possible period 1 consumption is
achieved by borrowing against period 2 labor income. The maximum that can be borrowed is $\frac{w_2}{1 + r}$,
as then the consumer could use all of period 2 labor income to pay back this borrowing and the
associated interest costs. Sometimes economists say that $\frac{w_2}{1 + r}$ is the present value of $w_2$ units of the
consumption good tomorrow in terms of units of the consumption good today. The consumer also
has the possibility of consuming nothing today but consuming a quantity $w_2 + w_1 (1 + r)$ tomorrow.
This is achieved by saving all of the labor income $w_1$ today and converting this into $w_1 (1 + r)$ goods
tomorrow by purchasing the financial asset that pays a risk-free real interest rate $r$.
Figure 4.2 is qualitatively the same as Figure 4.1 in that it has the same geometry. Thus, at
a mathematical level static and dynamic consumer theory must be the same. The reason
why the graphs for the static and dynamic consumer theory problems are the same is that the
two budget restrictions from dynamic consumer theory are equivalent to the budget constraint
$p_1 c_1 + p_2 c_2 \le I$. To see this mathematically, add together the two period budget
constraints after multiplying the constraint of period 2 by $\frac{1}{1 + r}$. This multiplication effectively
brings the quantities of period 2 to present (or period 1) value. The asset term then drops out. The
two equations to be added are listed below.
\[ c_1 + a_2 \le w_1 \]
\[ \frac{1}{1 + r} c_2 \le \frac{1}{1 + r} \left( w_2 + a_2 (1 + r) \right) \]
Adding them, the $a_2$ terms cancel and we obtain $c_1 + \frac{c_2}{1 + r} \le w_1 + \frac{w_2}{1 + r}$.
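The claim that the asset term drops out can be illustrated with a short sketch. Whatever amount $a_2$ is saved or borrowed, a plan that meets both period constraints with equality has a present value of consumption equal to $w_1 + w_2/(1 + r)$. The function name and the numbers are illustrative assumptions.

```python
def present_value_of_plan(w1, w2, r, a2):
    """Present value of consumption when both period constraints bind."""
    c1 = w1 - a2                 # period 1: c1 + a2 = w1
    c2 = w2 + a2 * (1 + r)       # period 2: c2 = a2 * (1 + r) + w2
    return c1 + c2 / (1 + r)     # period 2 consumption priced at 1 / (1 + r)

# Assumed illustrative values.
w1, w2, r = 50.0, 55.0, 0.10
I = w1 + w2 / (1 + r)

# Try borrowing (a2 < 0), no asset trade, and two levels of saving.
pv_by_saving_choice = {
    a2: present_value_of_plan(w1, w2, r, a2) for a2 in (-20.0, 0.0, 15.0, 50.0)
}
```

Every entry of `pv_by_saving_choice` equals `I` up to rounding, so the choice of $a_2$ moves the consumer along the budget line without changing its position.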
The budget constraint of the dynamic utility maximization problem can be seen as a reinterpretation
of the static utility maximization problem, where we reinterpret the two consumption goods
as consumption of a single good in periods 1 and 2 and we reinterpret wealth as the present value
of income flows. To see this formally set prices as $p_1 = 1$, $p_2 = \frac{1}{1 + r}$ and income as $I = w_1 + \frac{w_2}{1 + r}$.
Each of these three terms has a simple interpretation. $p_1$ and $p_2$ are the present value prices of
consumption in periods 1 and 2 stated in units of the period 1 consumption good. These prices state
how many time 1 goods are needed to purchase one unit of the time $t = 1, 2$ good.
Present value is a familiar concept from introductory economics and finance courses. In the
dynamic utility maximization problem, the present value of future labor income, for example, can
be interpreted as the maximum amount of borrowing that could be fully repaid (with interest) over
the life of the consumer by using all future labor income to pay off this borrowing. I is the present
value of current and future labor income. It equals current labor income w1 plus the present value
of future labor income w2 .
We can write the solution to the dynamic consumer problem, using the corresponding solution
to the static consumer theory problem. For example, when the utility function is $U(c_1, c_2) = c_1^{\alpha} c_2^{1 - \alpha}$
or is $U(c_1, c_2) = \alpha \log(c_1) + (1 - \alpha) \log(c_2)$, then the solution to the utility maximization problem
is given below. We also write down the optimal savings decision $a_2$, which is simply period 1 labor income
less period 1 consumption. Note that the optimal savings decision is backed out from the budget constraint
once one knows optimal consumption behavior. Thus, optimal savings behavior is determined from
optimal consumption behavior and the nature of budget constraints.
Theorem 4.2.1 If a consumer solves the two period utility maximization problem of choosing $c_1$
and $c_2$ to maximize $U(c_1, c_2) = \alpha \log(c_1) + (1 - \alpha) \log(c_2)$ subject to the present-value budget
constraint, then the consumer’s behavior is as follows:
\[ c_1 = \alpha \frac{I}{p_1} = \alpha I \]
\[ c_2 = (1 - \alpha) \frac{I}{p_2} = (1 - \alpha) \frac{I}{\frac{1}{1 + r}} \]
\[ I = w_1 + \frac{w_2}{1 + r} \]
\[ a_2 = w_1 - c_1 \]
The decision rules say that the consumer will spend fractions $\alpha$ and $1 - \alpha$ of the present value of
income $I$ on period 1 and period 2 consumption, respectively. $I = w_1 + \frac{w_2}{1 + r}$ is the relevant notion
of income in the two-period model. The price $p_1 = 1$ because the value of 1 unit of the time 1
good in units of the time 1 good is clearly 1. The price of the time 2 good in units of the time 1
good is $p_2 = 1/(1 + r)$.
Proof. First, the consumption plan is in the budget set. We know this because calculating the
present value of the consumption plan gives $I$. This means that the plan is affordable and that no
resources are “thrown away”. Second, the slope of the indifference curve, which is $-\frac{\alpha}{1 - \alpha} \frac{c_2}{c_1}$, equals
the slope of the budget line, which is $-(1 + r)$, when the consumption plan $(c_1, c_2)$ is dictated by the
solution above, since the solution implies $c_2/c_1 = \frac{1 - \alpha}{\alpha} (1 + r)$. Any other consumption plan that has
a present value of $I$ will have a marginal rate of substitution not equal to the slope of the budget line and,
thus, total utility could be increased by a small movement along the budget line for any other such plan.
Thus, the only candidate solution that has not been ruled out is the proposed solution.
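The two steps of the proof can be checked numerically. The sketch below computes the plan of Theorem 4.2.1, verifies that it exhausts $I$ and that the marginal rate of substitution equals $-(1 + r)$, and compares its utility with that of other plans on the budget line. The function names, the grid and the parameter values are illustrative assumptions.

```python
import math

def two_period_plan(alpha, w1, w2, r):
    """Decision rules of Theorem 4.2.1 for log utility."""
    I = w1 + w2 / (1 + r)             # present value of labor income
    c1 = alpha * I                    # p1 = 1
    c2 = (1 - alpha) * I * (1 + r)    # p2 = 1 / (1 + r)
    a2 = w1 - c1                      # savings backed out from the period 1 budget
    return c1, c2, a2, I

def utility(alpha, c1, c2):
    return alpha * math.log(c1) + (1 - alpha) * math.log(c2)

# Assumed illustrative values.
alpha, w1, w2, r = 0.5, 60.0, 44.0, 0.10
c1, c2, a2, I = two_period_plan(alpha, w1, w2, r)

present_value = c1 + c2 / (1 + r)             # should equal I
mrs = -(alpha / (1 - alpha)) * (c2 / c1)      # should equal -(1 + r)

# Utility of other plans that also have present value I (a coarse grid over c1).
best_grid_utility = max(
    utility(alpha, x, (I - x) * (1 + r))
    for x in (I * k / 1000 for k in range(1, 1000))
)
```

No plan on the grid yields more utility than the proposed solution, which is what the tangency argument predicts.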
According to this functional form, the overall utility of a consumption profile $c_1, c_2, c_3, ..., c_n$ is
a weighted sum of period utilities derived in each period. The parameter $\alpha_j$ is the weight of the
utility $u(c_j)$ derived by the consumer in period $j$ in the consumer’s overall utility. The weights $\alpha_j$
are numbers between zero and one that add up to one. These weights help put the utils from each
period $j$ in terms of utils of period 1. A common assumption is that the weights decline with $j$ and
reflect that the consumer is impatient. The second line says that the utility derived from $c_j$ in a
given period is given by the log of $c_j$. In other words, the period utility function is chosen to be
$u(c_j) = \log(c_j)$.
The budget constraint is described by $n$ inequalities. Each inequality describes the budget
constraint in a model period. Each of the period constraints says that the resources devoted to
consumption and savings (on the left hand side) cannot exceed the resources obtained from labor
income, savings from last period and the interest received on savings from last period. Notice that
savings can be negative ($a_j < 0$). This happens when the agent takes out a loan. These inequalities
are provided below. Note that the first and last period differ in that by assumption the agent is born
in period 1 with no assets and in that in the last period the agent is not allowed to take out new
loans and undertakes no savings.
\[ c_1 + a_2 \le w_1 \]
\[ c_2 + a_3 \le w_2 + a_2 (1 + r) \]
\[ c_3 + a_4 \le w_3 + a_3 (1 + r) \]
\[ \vdots \]
\[ c_n \le w_n + a_n (1 + r) \]
We can apply the same algebra step used in the two-period case to convert these $n$ period-by-period
budget constraints into a single present value budget constraint. This can be done by
multiplying the period $j$ constraint by $1/(1 + r)^{j-1}$ (the present value price of period $j$ consumption
in terms of period 1 goods) and then adding all the constraints together. The result is the present
value budget constraint below. Here, as in the two period case, we are abstracting from taxes and
transfers (from government and family) and any initial financial wealth. Government taxes and
transfers can easily be analyzed by interpreting $w_j$ to be the net labor income received in period $j$
after government taxes and transfers.
Definition 4.3.1 The Present-Value Budget Constraint of the n-period utility maximization
problem states that the present value of the consumption flows of each of the $n$ periods cannot
exceed the present value of the income flows received in each of the $n$ periods.
\[ c_1 + \frac{c_2}{1 + r} + ... + \frac{c_n}{(1 + r)^{n-1}} \le w_1 + \frac{w_2}{1 + r} + ... + \frac{w_n}{(1 + r)^{n-1}} \]
When the utility function is specialized to be $U(c_1, c_2, ..., c_n) = \alpha_1 \log(c_1) + \alpha_2 \log(c_2) + ... + \alpha_n \log(c_n)$,
then we can write down the demand functions describing best choices lying on the
budget constraint.
Theorem 4.3.1 If a consumer solves the n-period utility maximization problem of choosing
$c_1, c_2, ..., c_n$ to maximize $U(c_1, c_2, ..., c_n) = \alpha_1 \log(c_1) + \alpha_2 \log(c_2) + ... + \alpha_n \log(c_n)$ subject to
the present-value budget constraint, then the consumer’s behavior is characterized by the following:
\[ c_1 = \alpha_1 I \]
\[ \vdots \]
\[ c_j = \alpha_j \frac{I}{\frac{1}{(1 + r)^{j-1}}} \]
\[ \vdots \]
\[ c_n = \alpha_n \frac{I}{\frac{1}{(1 + r)^{n-1}}} \]
The decision rules say that the consumer will spend a fraction $\alpha_j$ of income $I$ on period $j$
consumption, where income is $I = w_1 + \frac{w_2}{1 + r} + ... + \frac{w_n}{(1 + r)^{n-1}}$. The asset holding behavior is
determined residually from the period by period budget constraints as follows:
$a_{j+1} = a_j (1 + r) + w_j - c_j$ given a starting value $a_1 = 0$ and given consumption.
Although we do not provide a proof, the logic of Theorem 4.3.1 is similar to that of Theorem
4.2.1. First note that, according to the optimal plan, the present value of consumption over
the life time is equal to $I$. This implies that the consumption plan is affordable and that no
resources are “thrown away”. Second, one can verify that, at this proposed solution, across
any two neighboring periods the marginal rate of substitution (slope of indifference curve) is
$MRS(c_j, c_{j+1}) = -U_j(c_1, ..., c_n)/U_{j+1}(c_1, ..., c_n) = -\frac{\alpha_j}{\alpha_{j+1}} \frac{c_{j+1}}{c_j}$ and equals $-(1 + r)$. Thus, the rate
at which the agent is indifferent to trading these goods is exactly the rate at which the financial
market allows him or her to trade these goods.
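The decision rules of Theorem 4.3.1 and the residual asset equation translate directly into a few lines of code. The sketch below is one possible implementation; the function names and the three-period example values are illustrative assumptions. List position `j` (starting at 0) corresponds to model period $j + 1$.

```python
def present_value(flows, r):
    """Present value in period 1 goods; flows[j] is received in period j + 1."""
    return sum(x / (1 + r) ** j for j, x in enumerate(flows))

def lifecycle_plan(alphas, wages, r):
    """Consumption and asset paths implied by Theorem 4.3.1."""
    I = present_value(wages, r)
    consumption = [a * I * (1 + r) ** j for j, a in enumerate(alphas)]
    assets = [0.0]                                  # a_1 = 0: born with no assets
    for w, c in zip(wages, consumption):
        assets.append(assets[-1] * (1 + r) + w - c) # a_{j+1} = a_j (1 + r) + w_j - c_j
    return consumption, assets, I

# Assumed illustrative values: three periods, weights adding up to one.
alphas = [0.5, 0.3, 0.2]
wages = [10.0, 20.0, 5.0]
r = 0.05
consumption, assets, I = lifecycle_plan(alphas, wages, r)

# Marginal rate of substitution between neighboring periods.
mrs = [
    -(alphas[j] / alphas[j + 1]) * (consumption[j + 1] / consumption[j])
    for j in range(len(alphas) - 1)
]
```

The last element of `assets` is what would be carried out of the final period. It is zero up to rounding, which confirms that the plan uses up exactly the present value of income.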
At this stage it is helpful to try to develop a graphical sense of the consumption and savings behavior
that this model is capable of producing. Here we will consider a case which is simple. To be
specific, let the agent live for $n = 60$ model periods. Think of these as covering ages of 21 to 80 in
real life. Assume that labor income $w_j$ is equal to 40 thousand dollars before retirement and zero
afterwards. Assume that consumers retire at age 62 (that is, on period $j = 41$). Let the real interest
rate be zero (i.e. $r = 0$) each model period and let all the weights on period utility $\alpha_j = 1/n = 1/60$
be the same each period.
Under these assumptions, the demand functions above imply that consumption is “flat” over the
lifetime in the sense that it is the same each model period. This follows mathematically from the
demand equations above. Specifically, the denominator is the same in each demand function when
the interest rate is zero and the numerator is the same in each equation. This combination of interest
rate and preference parameter assumptions produces a desire for a smooth or a flat consumption
profile over the lifetime.
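A quick calculation makes the flat profile concrete. The sketch below assumes that labor income of 40 is earned in periods 1 through 40 and is zero from period 41 onward; this is our reading of the retirement date, and the variable names are illustrative.

```python
n = 60                  # model periods (ages 21 to 80)
working_periods = 40    # assumption: income is earned in periods 1..40
wage = 40.0             # thousands of dollars per working period
r = 0.0                 # real interest rate

wages = [wage] * working_periods + [0.0] * (n - working_periods)
alphas = [1.0 / n] * n  # equal weights on each period's utility

# Present value of income and the decision rule c_j = alpha_j * I * (1 + r)^(j - 1).
I = sum(w / (1 + r) ** j for j, w in enumerate(wages))
consumption = [a * I * (1 + r) ** j for j, a in enumerate(alphas)]

# With r = 0, saving in a period is simply labor income less consumption.
saving = [w - c for w, c in zip(wages, consumption)]
```

Under these assumptions lifetime income is 1600, so consumption is 1600/60, roughly 26.7 thousand dollars, in every period. The agent saves while working and dissaves in retirement.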
It is clear that increasing the interest rate, holding preference parameters constant, will lead to
an upward sloping profile of consumption over the lifetime as future consumption is now cheaper
in present value terms. Thus, the growth rate of consumption would be positive across neighboring
periods. We can calculate how responsive the growth rate of consumption is to a change in the real
interest rate using the equations above. It is useful to express this as the percentage change
in the consumption growth rate per one percent change in the interest rate. A quick bit of
algebra indicates that a one percent increase in the gross interest rate $(1 + r)$ leads to a one percent
increase in the (gross) growth rate of consumption $(c_{j+1}/c_j)$.4 This holds when the period utility
function is $u(c_j) = \log c_j$.
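The "quick bit of algebra" is that the decision rules imply $c_{j+1}/c_j = (\alpha_{j+1}/\alpha_j)(1 + r)$, so gross consumption growth is proportional to the gross interest rate. A minimal numerical check follows; the function name and the numbers are illustrative assumptions.

```python
def gross_consumption_growth(alpha_j, alpha_next, r):
    """c_{j+1} / c_j implied by the log-utility decision rules."""
    return (alpha_next / alpha_j) * (1 + r)

# Assumed illustrative values: equal weights and a 2 percent real interest rate.
alpha_j, alpha_next, r = 0.02, 0.02, 0.02

# Raise the gross interest rate (1 + r) by one percent.
r_high = 1.01 * (1 + r) - 1

g_low = gross_consumption_growth(alpha_j, alpha_next, r)
g_high = gross_consumption_growth(alpha_j, alpha_next, r_high)

# Percentage change in gross consumption growth.
pct_change_in_growth = (g_high / g_low - 1) * 100
```

The result is a one percent change, whatever the preference weights are, because the weights cancel in the ratio `g_high / g_low`.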
were central in work in the 1950’s by Friedman, Modigliani, Kuznets and others.
6 The figure is taken from Attanasio and Weber (2010), Consumption and Saving: Models of Intertemporal Allocation
and Their Implications for Public Policy, Journal of Economic Literature, vol. 48, pages 693-751.
[Figure: Income and Consumption plotted against Age of Head (ages 20 to 80), in separate panels by education group.]
for choosing how much to consume each period. Instead, what matters is the present value of
income $I$, the preference parameters $\alpha_j$, the market interest rate $r$ and nothing else. It is significant
that the theory uses the assumption that consumers can borrow against future income.
Consider a special case of values of the preference parameters in relation to the interest rate.
Suppose that the preference parameters $\alpha_j$ governing the importance of period $j$ consumption are
proportional to the present value price of period $j$ consumption: $\alpha_j = \gamma/(1 + r)^{j-1}$ for some positive
value $\gamma$. If so, then the equation for optimal consumption behavior $c_j = \alpha_j \times I \times (1 + r)^{j-1}$ implies
that $c_j = \gamma I$ in every period, so consumption profiles over the life cycle are “flat” or constant.
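This special case is easy to verify numerically. In the sketch below $\gamma$ is pinned down by the requirement that the weights add up to one; the horizon, interest rate and income level are illustrative assumptions.

```python
# Assumed illustrative values.
n, r, I = 5, 0.05, 100.0

# Present value price of period j consumption, for j = 1..n.
pv_prices = [1 / (1 + r) ** j for j in range(n)]

# Weights proportional to present value prices, scaled so that they sum to one.
gamma = 1 / sum(pv_prices)
alphas = [gamma * p for p in pv_prices]

# Decision rule c_j = alpha_j * I * (1 + r)^(j - 1).
consumption = [a * I * (1 + r) ** j for j, a in enumerate(alphas)]
```

Every element of `consumption` equals `gamma * I`: the decline in the weights exactly offsets the fall in the present value price of later consumption.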
This does not work either! Ah, but it does tell us something that does work. If the preference
parameter $\alpha_j$ is hump shaped over the lifetime in that $\alpha_j$ is highest around age 50 and if the real
interest rate is approximately zero, then this theory produces a hump shaped consumption profile
$c_j = \alpha_j \times I \times (1 + r)^{j-1}$. This is not a very impressive explanation as one could explain any age
pattern in the data by an appeal to a similar pattern in the (unobserved) preference parameters.
It is also not the full explanation that present-day economists offer. It does however explain the
observations. If one wanted to test the theory, then one would need some observations to estimate
preference parameters and some additional observations that could be used to test the theory.
2. Situation 2: (Temporary Increase) Labor income is 20 in the first period (i.e. $w_1 = 20$) but
equals 10 in all other periods (i.e. $w_j = 10, \forall j \ge 2$).
3. Situation 3: (Permanent Increase) Labor income is 20 in each period.
The theory just described implies that consumption is flat over the life cycle. In Situation 1
consumption is equal to 10 each period. In Situation 2 consumption is equal to 10.2 each period. In
Situation 3 consumption is again flat and equal to 20 in all periods. The numbers for consumption
are derived by plugging in the data for labor income, the interest rate and the preference parameters
into the function describing best choices that was presented earlier. The function describing the
best choice in period $j$ is restated below:
\[ c_j = \alpha_j \times I \times (1 + r)^{j-1} \]
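The three consumption numbers can be reproduced with a short calculation. The horizon and interest rate are not restated in this passage, so the sketch assumes $n = 50$ periods, $r = 0$, equal weights $\alpha_j = 1/n$, and a Situation 1 income of 10 in every period. These assumed values are chosen because they are consistent with the reported consumption levels of 10, 10.2 and 20.

```python
def flat_consumption(wages):
    """With r = 0 and equal weights alpha_j = 1/n, c_j = I / n in every period."""
    return sum(wages) / len(wages)

n = 50  # assumed horizon, consistent with the reported value of 10.2

situation_1 = [10.0] * n                  # assumed baseline: income of 10 each period
situation_2 = [20.0] + [10.0] * (n - 1)   # temporary increase in period 1 only
situation_3 = [20.0] * n                  # permanent increase

c_baseline = flat_consumption(situation_1)
c_temporary = flat_consumption(situation_2)
c_permanent = flat_consumption(situation_3)
```

The temporary increase of 10 is spread over all 50 periods and raises consumption by only 0.2 per period, while the permanent increase raises consumption one for one.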
Let us use the simple model from the last subsection to get some understanding of how the
model might imply this behavior. Assume that the utility function parameters and interest rates
are such that optimal consumption profiles are approximately flat. Assume also that labor income
is strongly hump shaped as it is in U.S. data. A stylized graph of this situation is given in Figure
4.4. Note that asset holdings must be consistent with the equation: $a_{j+1} = w_j + a_j (1 + r) - c_j$. This
effectively implies that hump shaped labor income $w_j$ and flat consumption $c_j$ imply that asset
holding has a hump.
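A stylized example shows how the asset equation turns hump shaped income and flat consumption into a hump in asset holdings. The eight-period income profile and the zero interest rate below are illustrative assumptions, not data.

```python
# Assumed stylized income profile: low early, high in the middle, zero at the end.
wages = [10.0, 20.0, 30.0, 30.0, 20.0, 10.0, 0.0, 0.0]
r = 0.0
n = len(wages)

# Flat consumption: with r = 0 and equal weights, c_j = I / n.
consumption = sum(wages) / n

# Asset equation a_{j+1} = w_j + a_j (1 + r) - c_j, starting from a_1 = 0.
# assets[k] holds a_{k+1}.
assets = [0.0]
for w in wages:
    assets.append(w + assets[-1] * (1 + r) - consumption)

# Saving in each period (labor income less consumption, since r = 0).
saving = [w - consumption for w in wages]

# Period j at which a_j is largest.
peak_period = assets.index(max(assets)) + 1
```

In this example the agent borrows in the first period, accumulates assets during the high income periods, and runs assets down to zero during the final periods without income.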
The upshot, as far as savings rates are concerned, is three-fold: (1) savings rates are low early
in life as income is temporarily low, (2) savings rates are high in the middle of the life cycle as income
is temporarily high and (3) savings rates are low late in life as income is low. This explanation
highlights that low income households dissave and they are typically young and old agents, whereas
high income households typically save as they are middle-aged agents experiencing high earnings
and anticipating low earnings later in life.8
8 Milton Friedman suggested a complementary mechanism to produce the savings rate observations: high income
households have a high fraction with a positive-but-temporary earnings shock, whereas low income households have a
high fraction with negative-but-temporary earnings shock. Optimal smoothing behavior dictates saving part of a positive
4.5 Overview
This chapter uses standard consumer theory from introductory microeconomics to develop a theory
of consumption and savings decisions over the lifetime. The theory takes the view that consumers
have perfect foresight over future labor market earnings and future interest rates and make best
lifetime plans. The perfect foresight assumption may not seem very realistic, but it does extend
standard consumer theory to apply to lifetime choices. The dominant framework for thinking
about consumption and savings decisions used by current-day economists is a much more elaborate
version of this simple theory derived from the work of Fisher, Modigliani and Friedman among
others.
Perhaps the most critical way that modern theory differs from the simple theory given in this
chapter is that modern theory allows for labor income risk, both due to economy-wide risk as
well as individual-specific risk, and for some market imperfections. The market imperfections
considered by modern theory include the lack of some financial markets that could be used to insure
or hedge important components of labor income risk as well as borrowing constraints that impede
the possibility of using future income to finance current consumption.
This section describes a version of the Solow growth model but with optimizing consumers. We
will call this model the “life-cycle model”. This label stresses the fact that the consumers that live
in this model economy pass through life-cycle stages in that they are born and will later die. This
will be relevant for the issue of which consumers in the model will want to buy physical capital and
which ones will want to sell physical capital.
The production technology in the life-cycle model is the same as in the Solow model. Thus,
output is produced by a constant returns technology using capital and labor inputs. Furthermore,
marginal products are diminishing. This means that many of the insights from growth theory will
hold within the life-cycle model. For example, three things will hold: (i) steady-state growth will
occur only if there is growth in technology, (ii) output growth will equal the sum of technology
growth and weighted input growth and (iii) there is the possibility that too much capital may be
accumulated within the model so that the model economy may potentially have a capital-labor ratio
greater than the Golden rule capital-labor ratio.
The life-cycle model assumes that consumers maximize utility as is standard in microeconomics.
Adding optimizing consumers is important in a number of ways. First, and most obviously, the
savings rate of the economy will respond to policy changes and to shocks impacting the economy.
This was not true in the Solow model as the savings rate was exogenous. If one is sympathetic to
the view that people respond to incentives, then adding optimizing consumers is certainly consistent
with such a view. Second, one can ask whether or not the functioning of the model economy can
be improved. Economists find it natural to ask whether improvements in the model economy are
possible in the Pareto sense (i.e. is there an alternative feasible allocation that increases at least one
person’s utility without lowering anyone’s utility). If the individuals populating a theoretical model
have no clear preferences over things they end up choosing, then it is not so clear how to evaluate
whether the functioning of the economy can be improved. However, when the individuals do have
clear preferences, then it is natural to use these preferences and the Pareto criterion to determine if
the functioning of the economy can be improved.
The life-cycle model will be extremely simple. The simple structure of the model leads us to
use the model to gain qualitative insight but not quantitative insight. At a later stage, we will use
the model to get qualitative insight into the effects of (i) a one-time increase in the population, (ii)
an increase in government war expenditures that are financed in different ways, (iii) a temporary
tax cut, (iv) the adoption of an unfunded social security system and (v) a temporary or permanent
change in the technology. The simple structure will lead to clear insights. Thus, the model provides
qualitative answers to traditional questions related to how the economy responds to shocks and to
fiscal policy changes.
\[ K_{t+1} = K_t (1 - \delta) + I_t \]
5. Accounting Framework: Aggregate consumption and investment equal GDP. GDP is
produced with aggregate capital and labor. GDP also equals labor income plus capital
income, where $w$ and $R$ are rental prices of labor and capital.
\[ C_t + I_t = Y_t = F(K_t, L_t) = w_t L_t + R_t K_t \]
\[ C_t = N c_{y,t} + N c_{o,t} \]
Consumers will hold all capital and they make up the supply side of the market, whereas firms
are on the demand side. It will turn out that the rental market price Rt and the real interest rate rt
on loans are closely connected. We will come to the loan market later on but it is fair to say that
this market will be special since, absent government debt, consumers will be on both supply and
demand sides.
We assume that all markets in the model are competitive. Thus, at any time $t$, consumers will
take the wage $w_t$, the rental rate $R_t$ and the real interest rate $r_t$ paid as given and beyond their control.
Each young consumer born at time $t$ will then make a consumption-saving-labor plan over the
lifetime $(c_{y,t}, c_{o,t+1}, a_{t+1}, l_t)$. This plan solves the problem of maximizing utility over the lifetime,
given the budget constraint. This exact problem was studied in Theorem 4.2.1 from chapter 4. The
utility function of an agent is of the Cobb-Douglas form. Thus, optimal decisions take a simple
form and are listed below.
\[ \begin{pmatrix} c_{y,t} \\ c_{o,t+1} \\ a_{t+1} \\ l_t \end{pmatrix} = \begin{pmatrix} \alpha w_t \\ (1 - \alpha) w_t (1 + r_{t+1}) \\ (1 - \alpha) w_t \\ 1 \end{pmatrix} \]
We highlight two key points about the consumption-saving-labor plan. First, remember from
chapter 4 that when an agent can work when young but not when old then the present value of
current and future labor income is simply $w_t$ for the young agent alive in period $t$. Thus, these
agents are “worth” $w_t$. Second, the savings choice $a_{t+1}$ of young agents in period $t$ represents both
the holding of physical capital and real loans. As these two different assets are both risk-free in this
model it should make intuitive sense that they should have the same real return $r_{t+1}$.
We know from the growth theory chapter that in competitive theory the wage will turn out to
equal the marginal product of labor. A competitive firm will in theory choose to hire additional
labor input up to the point where the marginal product of the last worker equals the wage. But
this marginal product is in turn determined by the production function and the supplies of total
capital and labor. The connection between wages and factor inputs is listed below. We also list the
relation between the rental rate of capital $R_t$, the real interest rate $r_t$ and marginal products. These
follow a similar logic that was discussed in the growth theory chapter.
\[ w_t = F_L(K_t, N) = (1 - \beta) A K_t^{\beta} N^{-\beta} = (1 - \beta) A k_t^{\beta} \]
\[ R_t = F_K(K_t, N) = \beta A K_t^{\beta - 1} N^{1 - \beta} = \beta A k_t^{\beta - 1} \]
\[ r_t = F_K(K_t, N) - \delta \]
The three equations above hold at any time given the available quantities of capital $K_t$ and labor
$N$. Labor supply $L_t$ equals $N$ as each of the $N$ young agents works one unit of time. The capital
level $K_t$ was determined by the savings decisions of young agents alive in period $t - 1$.
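The factor price equations and the decision rules of a young agent can be combined in a short sketch. The function names are our own, the technology parameters are borrowed from the numerical example used later in the chapter, and the capital-labor ratio and the next-period interest rate passed to the young agent's plan are illustrative assumptions.

```python
def factor_prices(k, A, beta, delta):
    """Competitive factor prices for Cobb-Douglas output per worker y = A * k**beta."""
    wage = (1 - beta) * A * k ** beta       # marginal product of labor
    rental = beta * A * k ** (beta - 1)     # marginal product of capital
    real_rate = rental - delta              # real interest rate r = F_K - delta
    return wage, rental, real_rate

def young_plan(alpha, wage, r_next):
    """Decision rules of a young agent who works only when young."""
    c_young = alpha * wage
    assets = (1 - alpha) * wage
    c_old = assets * (1 + r_next)
    return c_young, c_old, assets

# (A, delta, beta) and alpha as in the chapter's numerical example; k is assumed.
A, delta, beta, alpha = 10.0, 0.1, 0.5, 0.5
k = 4.0

wage, rental, real_rate = factor_prices(k, A, beta, delta)
output_per_worker = A * k ** beta

# r_next is set to today's rate purely for illustration.
c_young, c_old, assets = young_plan(alpha, wage, r_next=real_rate)
```

Because the technology has constant returns to scale, factor payments exhaust output: `wage + rental * k` equals `output_per_worker`.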
One thing that is missing from this account is an understanding of how the aggregate capital
stock changes over time. What determines how the aggregate capital stock moves over time? The
answer is that it all depends on savings behavior. The only agents that are going to hold savings
between periods are young agents. The reason is obvious. The young care about consumption when
old but have no source of labor earnings when old. Thus, holding assets (physical capital) is the
means to consume in old age in the benchmark model without a government. The old agents will
not hold assets for an additional period as they will be dead next period and they get no joy from
bequeathing assets to their children or to anyone else. Yes, the model is very stylized!
The first equation below describes how the capital stock evolves over time. Total capital holding
from period $t$ to period $t + 1$ is $N$ times the amount $a_{t+1} = (1 - \alpha) w_t$ of savings of each young
agent. This comes from dynamic consumer theory. Since the wage at time $t$ is the marginal product
of labor, the first equation substitutes this in for the wage. The second equation follows from the
first by dividing each side of the first equation by labor. This is a standard and useful transformation
that was widely used in analyzing the Solow model.
\[ K_{t+1} = N a_{t+1} = N (1 - \alpha) w_t = N (1 - \alpha)(1 - \beta) A k_t^{\beta} \]
\[ k_{t+1} = a_{t+1} = (1 - \alpha) w_t = (1 - \alpha)(1 - \beta) A k_t^{\beta} \]
The equation $k_{t+1} = a_{t+1} = (1 - \alpha) w_t = (1 - \alpha)(1 - \beta) A k_t^{\beta}$ is very important in our analysis.
We will call this equation the law of motion for capital. Just as in the Solow model, once one knows
how the capital-labor ratio moves over time then one can easily figure out how all the other model
variables move over time. This is because the output-labor ratio, investment-labor ratio and factor
prices are all simple functions of how the capital-labor ratio moves over time.
Figure 5.1 graphs the law of motion. The current capital-labor ratio kt is on the horizontal axis
and the future value kt+1 is on the vertical axis. The places where the graph crosses the 45 degree
line are steady states. Steady states are capital-labor ratios which do not change over time. Thus,
when the economy is in steady state no variable will change over time unless the economy is hit by
an exogenous shock.
Figure 5.1 shows that there is a unique positive capital steady state and a trivial zero capital
steady state. The graph also implies that the economy converges to the positive capital steady state
from any positive initial value of the capital-labor ratio. This occurs as the graph lies above the 45
degree line to the left of this steady state and lies below it to the right of this steady state.
It is useful to consider a numerical example to illustrate how this model works. We use the
law of motion graph which simply plots the equation $k_{t+1} = (1 - \alpha)(1 - \beta) A k_t^{\beta}$. This equation
allows one to calculate how the capital-labor ratio moves over time. The key inputs are the model
parameters $(A, \alpha, \delta, \beta)$ and the initial capital-labor ratio $k_0 = 1.0$. The numerical example sets the
technology parameters to $(A, \delta, \beta) = (10.0, 0.1, 0.5)$ and the preference parameter to $\alpha = 0.5$. In
Table 5.1 we plug in the value $k_0 = 1.0$ into the law of motion and find that $k_1 = 2.5$. We repeat this
procedure to produce the sequence of capital-labor ratios in Table 5.1. Once one has calculated how
the capital-labor ratio moves over time, calculating all other variables of interest is straightforward.
This is because all other variables are simple functions of the capital-labor ratio.
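The procedure just described is a simple iteration and can be sketched in a few lines. The parameter values are those of the numerical example; the function name, the number of iterations and the closed-form expression for the positive steady state (obtained by setting $k_{t+1} = k_t$ in the law of motion) are our own additions.

```python
def next_k(k, A=10.0, alpha=0.5, beta=0.5):
    """Law of motion k_{t+1} = (1 - alpha) * (1 - beta) * A * k_t**beta."""
    return (1 - alpha) * (1 - beta) * A * k ** beta

# Iterate the law of motion from k_0 = 1.0.
path = [1.0]
for _ in range(10):
    path.append(next_k(path[-1]))

# Positive steady state: k* = ((1 - alpha) * (1 - beta) * A) ** (1 / (1 - beta)).
A, alpha, beta = 10.0, 0.5, 0.5
k_star = ((1 - alpha) * (1 - beta) * A) ** (1 / (1 - beta))

# Other variables follow from the capital-labor ratio.
output_per_worker = [A * k ** beta for k in path]
wages = [(1 - beta) * A * k ** beta for k in path]
```

The first step reproduces $k_1 = 2.5$, and with these parameter values the path rises monotonically toward the steady state $k^* = 6.25$.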
neither technology parameters (e.g. β and A) nor preference parameters (e.g. α) change. Thus, the
law of motion graph does not move as a result of the one-time change in the population.
Although the law of motion does not change, what does change is the capital-labor ratio. The
one-time increase in the population decreases the capital-labor ratio simply because the denominator
(labor input) grows while the capital stock at least initially stays constant. After this one-time
change occurs, then the law of motion tells us that the economy returns over time to the steady-state
level k∗ of the capital-labor ratio.
Model Predictions:
1. Capital-labor ratio: The capital-labor ratio falls at the time of the shock but afterwards
increases monotonically over time to return to the steady-state level.
2. Output-labor ratio: The output-labor ratio falls at the time of the shock but increases
monotonically over time to the steady-state level afterwards. This occurs because in any time
period $y_t = A F(k_t, 1)$ and because the production function is increasing in $k_t$. Even though
the output-labor ratio falls, total output increases over time in the model. In fact, the model
implies that over time output must increase by exactly the percentage increase in the labor
input.
3. Wage: The wage $w_t$ falls at the time of the shock but increases in each period after the shock.
This occurs because in any time period $w_t = A F_L(k_t, 1)$ and because the marginal product of
labor is (by the Cobb-Douglas assumption) increasing in the capital-labor ratio.
4. Investment: The model predicts a boom in total investment. We can see this directly from the
behavior of the capital-labor ratio. This ratio initially falls entirely because the denominator
(labor) increases. The denominator then stays fixed. Thus, the only way for the capital-labor
ratio to return to steady state is for the total capital stock to increase.
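These predictions can be illustrated by simulating the law of motion after a one-time population increase. The sketch reuses the parameter values of the earlier numerical example; the population sizes (a 20 percent increase starting from the steady state) are illustrative assumptions and are not meant to match the Israeli data.

```python
A, alpha, beta = 10.0, 0.5, 0.5
k_star = ((1 - alpha) * (1 - beta) * A) ** (1 / (1 - beta))

N_old, N_new = 100.0, 120.0        # assumed one-time 20 percent population increase
K_before = k_star * N_old          # capital stock inherited at the time of the shock
Y_before = N_old * A * k_star ** beta
wage_before = (1 - beta) * A * k_star ** beta

# At the shock the capital stock is unchanged but labor input is larger.
k = K_before / N_new
k_path, wage_path, K_path, Y_path = [], [], [], []
for _ in range(12):
    k_path.append(k)
    wage_path.append((1 - beta) * A * k ** beta)
    K_path.append(N_new * k)                    # total capital stock
    Y_path.append(N_new * A * k ** beta)        # total output
    k = (1 - alpha) * (1 - beta) * A * k ** beta  # law of motion
```

In the simulation the capital-labor ratio and the wage drop on impact and then recover, the total capital stock rises (the investment boom), and total output ends up about 20 percent above its pre-shock level.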
What Actually Happened to Israel?
There is evidence that around the time of the population change there was (1) a strong increase
in GDP growth rates and (2) an investment boom. Both of these are predicted by the life-cycle
model.
economies that are more general than the life-cycle model laid out earlier.4 It says that as long as
the real interest rate period by period is always positive, then the allocation produced by competitive
markets within the model is Pareto efficient. Thus, if this positive interest rate condition holds,
then no Pareto improvements can be made. This holds even if we add an all-powerful being to the
model. Specifically, if this all-powerful being can choose an alternative allocation, then it cannot
improve welfare in the Pareto sense while using the production technology that is available within
the model.
A1: The utility function $U(c_{y,t}, c_{o,t+1})$ is increasing in both components and has a well-defined
marginal rate of substitution.
A2: The production function $F_t(K_t, L_t)$ is constant returns to scale for any time period $t$.
Furthermore, the production function together with the depreciation rate $\delta$ imply that the capital
stock and output must remain bounded.
PROPOSITION: Consider any version of the life-cycle model that satisfies assumptions A1
and A2.
If the allocation produced by such a life-cycle model with competitive markets has the property
that $1 + r_t > 1 + \varepsilon$ for all time periods $t \ge 1$ for some positive number $\varepsilon > 0$, then the allocation
produced by this model is Pareto efficient.
Step 4: (Snowball Grows Without Bound) One period in the future (period $t + 1$) it must be the
case that the (future) young give up consumption to finance the snowball effect in Step 3. If the
young agents at $t + 1$ are to be made no worse off, then they must in turn be compensated when old.
They give up an amount equal to $(1 + r_{t+1}) \times \Delta$ when young so they must (following the logic of
Step 3) be compensated by their marginal rate of substitution times this amount. This is given in
the equation below.
Repeating this argument, the compensation grows in each generation since the real interest rate
in each generation is positive.
theorem in most intermediate microeconomics textbooks and in Wikipedia. These treatments typically do not emphasize
the time dimension at all or do not allow for an infinite time horizon that is a feature of the life-cycle model. Economists
have understood that versions of the first welfare theorem apply to situations with both time and economic uncertainty at
least since Gerard Debreu’s “Theory of Value” published in 1959 by Yale University Press.
4 They are more general in that they allow a larger class of utility functions U and production functions F than are
used in the benchmark life-cycle model. The benchmark model uses Cobb-Douglas production and utility functions.
5 A heuristic argument is one that is suggestive but not definitive in settling an issue. An argument that proves the
Proposition can be made, but it will be less transparent for many readers than the heuristic argument.
Step 5: (New Allocation Is Not Feasible) The upshot of Step 4 is that the snowball effect
implies that eventually the transfer of resources from young to old will be arbitrarily large (i.e.
larger than any fixed number). This is infeasible as it was assumed that the production technology
implies that capital and output must remain bounded. Therefore, it is not feasible to make a small
gift of ∆ > 0 to the old agents without decreasing the utility of any other agents. This ends the
heuristic argument.
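The snowball logic of Steps 4 and 5 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the gift size (0.01), the constant interest rate (ε = 0.02), the horizon (600 periods) and the bound on resources (100) are all hypothetical numbers chosen for the example, and the function name `snowball` is not from the text.

```python
def snowball(delta, rates):
    """Compensation owed to each successive generation.

    The first entry is the initial gift delta. Each later generation must be
    compensated by (1 + r) times what the previous generation received.
    """
    transfers = [delta]
    for r in rates:
        transfers.append(transfers[-1] * (1.0 + r))
    return transfers


# Hypothetical numbers: a small gift, and an interest rate that always
# stays at the (assumed) lower bound eps.
eps = 0.02
path = snowball(0.01, [eps] * 600)

# Hypothetical upper bound on the resources the economy can ever produce.
bound = 100.0
first_period_over_bound = next(i for i, x in enumerate(path) if x > bound)
print("required transfer exceeds the bound in period", first_period_over_bound)
```

Because every interest rate is at least ε, the required transfer is at least Δ(1 + ε)^n after n generations, so it eventually exceeds any fixed bound no matter how small the initial gift is.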
It does not make long-term investment decisions in how many buildings and machines to purchase
and hold between model periods. Instead, consumers make all of these long-term investment
decisions. However, in modern economies firms often own the buildings and machines that they use.
Thus, unlike in the life-cycle model, firms in modern economies make many investment decisions.
The purpose of this subsection is to outline how the analysis would have to be modified if the
firm were to make investment decisions and what might be gained by doing so.
The objective of the firm would need to change. The new objective would be that the firm
chooses investment and labor over time so as to maximize the discounted value of "dividends" over
the infinite future for its shareholders. The "dividend" of the firm at time t would be the firm’s
output less wages and less investment: $dividend_t = F(K_t, L_t) - w_t L_t - (K_{t+1} - K_t(1 - \delta))$. The
idea of discounting is that, when the real interest rate is positive, a dividend of a given size has a
different current value depending on whether it is paid in the current period or one period in the
future. Real interest rates determine the discount factors. This notion of discounting is the same as
the notion of present value from the chapter on consumer theory.
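A minimal sketch of the dividend and its discounted value follows. It assumes a finite list of dividends (a truncation of the infinite horizon described above), and the function names and the convention that `rates[t]` is the real interest rate between period t and period t + 1 are choices made for this illustration, not part of the text.

```python
def dividend(output, wage, labor, k_next, k_now, delta):
    """Output less wages and less investment, as in the dividend formula."""
    investment = k_next - k_now * (1.0 - delta)
    return output - wage * labor - investment


def present_value(dividends, rates):
    """Discounted value of a finite stream of dividends.

    dividends[0] is paid today and is not discounted. rates[t] is the
    (assumed) real interest rate between period t and period t + 1.
    """
    value, discount = 0.0, 1.0
    for d, r in zip(dividends, rates):
        value += discount * d
        discount /= (1.0 + r)
    return value


# Hypothetical example: the same dividend of 1 paid today and next period,
# with a real interest rate of 25 percent.
print(present_value([1.0, 1.0], [0.25, 0.25]))
```

With a positive interest rate the dividend paid next period contributes less than the one paid today, which is the sense in which the two have different current values.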
Consumers in the economy would buy and sell shares in this firm. The owners of these shares
receive the dividend payments and, thus, these shares have value. The value of these shares would
be a key component of wealth for consumers. In theory the value of the firm’s shares would equal
the value of discounted future dividends. In this way, the value of the firm would correspond to the
notion of the stock market value of firms in modern economies. Formulating the model in this way
would allow an analysis of how shocks to the economy impact output, wages and the interest rate (as
before) and how the same shocks impact the stock market value of the firm (not covered before).
Thus, the benefit of modeling firm investment decisions would be a theory of how shocks move the
stock market and the overall economy.
A sure way to start a fight amongst a group of economists is to start talking with great confidence
about what produces business-cycle fluctuations. Economists largely agree on some of
the main statistical features of business-cycle fluctuations. This much is good news. The big
disagreements are over what type of theory might usefully explain the type of fluctuations that
are observed. The theories discussed in this chapter offer different candidates for the fundamental
driving shocks that produce business-cycle fluctuations.
This chapter is organized in five main parts. First, business-cycle facts are presented and
discussed. Second, the outlines of a class of business-cycle theories are presented; this class is in conflict
with a key business-cycle fact: procyclical labor productivity. Third, a technology-driven theory
of the business cycle is presented that has a mechanism for producing output fluctuations that
feature procyclical labor productivity. Fourth, the view that business cycles are technology driven
is contrasted with some quotes from John Maynard Keynes.1 According to Keynes, one key driver
of business-cycle fluctuations is the “animal spirits” of investors. Modern authors have tried to
view Keynesian “animal spirits” as rational, self-confirming fluctuations in business confidence.
Fifth, we review the logic of Robert Lucas’ provocative calculation of the maximum potential gain
from perfectly smoothing out the business cycle.
and Money”.
78 Chapter 6. Business-Cycle Fluctuations
“... a type of fluctuation found in the aggregate economic activity of nations that organize their
work mainly in business enterprises: a cycle consists of expansions occurring at about the same time
in many economic activities, followed by similarly general recessions, contractions, and revivals
which merge into the expansion phase of the next cycle; this sequence of changes is recurrent but
not periodic; in duration business cycles vary from more than one year to ten or twelve years; they
are not divisible into shorter cycles of similar character with amplitudes approximating their own."
It is not very common any more to characterize business cycle facts in the manner of Burns
and Mitchell. Instead, there is some agreement that it would be useful to divide a time series of
a variable (e.g. the log of real GDP) into a trend component and a cycle component. Thus, for a
variable $y_t$, one would say $y_t = y_t^{trend} + y_t^{cycle}$. Figure 6.1 carries this out using US data on GDP and
using a particular way of defining the smooth trend line. The cyclical component of log GDP in a
given time period is then the vertical distance in Figure 6.1 between log GDP and the smooth trend
line for log GDP.
To determine the cyclical component of a series, one needs to make a couple of decisions. First,
it is common to take the logarithm of many variables such as GDP, consumption, investment, and
wages. This is done in Figure 6.1 where we graph the log of US GDP. Intuitively, if the components
of GDP are moving around a roughly constant trend growth rate, then taking the log will mean that
these transformed variables are moving around a “nearly” straight trend line. Also taking logs will
mean that deviations from trend can be interpreted as percentage deviations from trend.2
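The claim that log deviations from trend can be read as percentage deviations is easy to check numerically. The sketch below uses hypothetical numbers (a trend level of 100) and hypothetical function names; it only illustrates how close the two measures are for small deviations and how they drift apart for large ones.

```python
import math


def log_deviation(actual, trend):
    """Cycle component when the variable is measured in logs."""
    return math.log(actual) - math.log(trend)


def pct_deviation(actual, trend):
    """Deviation from trend as a fraction of the trend level."""
    return (actual - trend) / trend


# Hypothetical levels: a small (2 percent) and a large (50 percent) deviation.
for actual in (102.0, 150.0):
    print(actual, log_deviation(actual, 100.0), pct_deviation(actual, 100.0))
```

For deviations of the size typical of business cycles (a few percent) the two numbers are nearly identical, which is why the approximation is considered harmless in this context.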
Second, one needs an operational definition of trend for each series. Figure 6.1 is based on
a standard method for defining a smooth trend line in a business-cycle context.3 It is clear from
Figure 6.1 that the trend line is smoother than the data. Thus, the wiggles in the data that occur at a
frequency of several years (high frequency variation) are included in the cyclical component of
GDP. It should also be intuitively clear that the slope of the smooth trend line changes over time, as the average
slope over a period of a decade or more differs noticeably from one period to the next. Thus, the trend component
picks up some low frequency movement in output, so that this low frequency movement in
output is not present in the cyclical component as defined by this procedure. The low frequency
movements could also be viewed as “slow moving” components.
Figure 6.2 graphs the cycle component of output $y^{cycle}$ as well as the cycle components of
consumption and investment. The cycle components of output (GDP) and consumption tend to
move up and down together and have a similar magnitude. The cycle component of investment is
much more volatile than that of output, and investment tends to move up and down together with output.
Figure 6.3 graphs the cyclical component of output and the cyclical component of total labor
hours. One can see from the graph that labor hours are about as variable as output and tend to
move up and down together with output. Business cycle facts will simply involve quantitatively
2 A consequence of taking logs is that the cycle component can be viewed as the percentage deviation of the (unlogged)
variable from the trend component. To see this let lower case letters denote the log of the upper case letters so that
$y_t = \log(Y_t)$ and note that $\log(1 + g) \approx g$ when g is small:
$y_t^{cycle} \equiv y_t - y_t^{trend} = \log(Y_t) - \log(Y_t^{trend}) = \log(Y_t / Y_t^{trend}) = \log(1 + (Y_t - Y_t^{trend})/Y_t^{trend}) \approx (Y_t - Y_t^{trend})/Y_t^{trend}$
3 Given a time series of a variable $(y_1, y_2, ..., y_T)$, the method developed by Hodrick and Prescott involves choosing
the trend terms $(y_1^{trend}, y_2^{trend}, ..., y_T^{trend})$ to minimize the following expression when the smoothing parameter λ is set to
λ = 1600:
$\sum_{t=1}^{T} (y_t - y_t^{trend})^2 + \lambda \sum_{t=2}^{T-1} [(y_{t+1}^{trend} - y_t^{trend}) - (y_t^{trend} - y_{t-1}^{trend})]^2$
Intuitively, minimizing the objective function implies that deviations from trend (the first term) should not be too large
and that movements in the trend itself should not be too large (the second term).
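The Hodrick-Prescott minimization has a closed-form solution, which the sketch below computes in plain Python. This is an illustrative implementation, not the code used to produce the figures in this chapter: the function name `hp_filter` and the dense Gaussian-elimination solver are choices made for this example. Setting the derivative of the objective to zero gives the linear system (I + λD'D) × trend = y, where D is the second-difference matrix.

```python
def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) minimizing the Hodrick-Prescott objective."""
    T = len(y)
    # Build A = I + lam * D'D, where each row of D is (1, -2, 1) applied
    # to three consecutive trend terms.
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    weights = (1.0, -2.0, 1.0)
    for r in range(T - 2):
        cols = (r, r + 1, r + 2)
        for ci, wi in zip(cols, weights):
            for cj, wj in zip(cols, weights):
                A[ci][cj] += lam * wi * wj
    # Solve A * trend = y by Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for col in range(T):
        piv = max(range(col, T), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, T):
            f = A[r][col] / A[col][col]
            if f != 0.0:
                for c in range(col, T):
                    A[r][c] -= f * A[col][c]
                b[r] -= f * b[col]
    # Back substitution.
    trend = [0.0] * T
    for r in range(T - 1, -1, -1):
        s = sum(A[r][c] * trend[c] for c in range(r + 1, T))
        trend[r] = (b[r] - s) / A[r][r]
    cycle = [yi - ti for yi, ti in zip(y, trend)]
    return trend, cycle


# Hypothetical series: a straight line plus an alternating wiggle.
series = [0.1 * t + (0.3 if t % 2 else -0.3) for t in range(12)]
trend, cycle = hp_filter(series)
```

A series that is exactly a straight line has zero second differences, so the filter returns it unchanged as trend with a zero cycle. For long series a sparse or banded solver would be the more practical choice; the dense solver here is kept only for transparency.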
6.1 Business-Cycle Facts 79
[Figure 6.1. US Log GDP 1948-2012: Data and Trend. Horizontal axis: Year. Vertical axis: Log Units. Series: Log GDP, Trend.]
documenting how variable different series are and the degree to which different series tend to
move in the same direction as compared to the cyclical component of GDP. We will use standard
measures used in a course on statistics to state a measure of variability and a measure of how a
series moves in relation to GDP.
Table 6.1 below documents some of the properties of the series displayed in Figure 6.2 and
Figure 6.3. First, we discuss the amplitudes of the cycle components of each series. Table 6.1
documents that consumption of nondurable goods and services is less variable than GDP but that
investment is substantially more variable than GDP. We use the standard deviation from elementary
statistics as a measure of the variability or amplitude of a series. A one standard deviation movement
of GDP from trend is a 1.7 percent movement, whereas corresponding one standard deviation
movements of consumption of nondurable goods and services and investment are a 0.9 percent
movement and a 5.0 percent movement, respectively.
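The two summary statistics used in this section, the standard deviation of a cycle component and its correlation with the cycle component of GDP, can be computed as in the sketch below. The function names and the short made-up series are hypothetical; they are not the US data behind Table 6.1.

```python
import math


def std_dev(x):
    """Standard deviation of a series (population form)."""
    m = sum(x) / len(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def correlation(x, y):
    """Correlation coefficient between two series of equal length."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (std_dev(x) * std_dev(y))


# Made-up cycle components (log deviations from trend), for illustration only.
gdp_cycle = [0.01, -0.02, 0.015, -0.005, 0.02, -0.02]
inv_cycle = [0.04, -0.06, 0.03, -0.02, 0.07, -0.06]

print("std of GDP cycle:", std_dev(gdp_cycle))
print("std of investment cycle:", std_dev(inv_cycle))
print("correlation with GDP:", correlation(inv_cycle, gdp_cycle))
```

A series is called procyclical when this correlation is positive and countercyclical when it is negative.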
[Figure 6.2. US Business Cycles: Output Components. Horizontal axis: Year. Vertical axis: Business Cycle Component.]
Table 6.1 also examines the cyclical component of total labor hours in the US. Our measure
of total labor hours is approximately as volatile as GDP with a one standard deviation movement
equal to 1.5 percent. This fact is apparent from Figure 6.3. An interesting fact from Table 6.1 is
that the cyclical component of employment (the number of people working) is nearly as variable
as the cyclical component of total labor hours. Note that total labor hours equals the number of
people working times the average hours worked per employed person. Thus, Table 6.1 implies that
the bulk of the cyclical variation in total labor hours comes from people moving into and out of
employment rather than all employed people varying their work hours over the business cycle.
[Figure 6.3. US Business Cycles: Output, Labor and Productivity. Horizontal axis: Year. Vertical axis: Business Cycle Component.]
Next we discuss the correlation of the cyclical component of each series with the cyclical
component of GDP. It is true that any time series is perfectly correlated with itself. Thus, the
correlation of GDP with itself is 1.0. Both consumption and investment are procyclical in that they
display strong positive comovement with GDP. The correlation is .79 for consumption and .84 for
investment. Government spending has a small positive correlation with GDP equal to .05. This
measure of government spending includes government spending on goods and services (e.g. public
education expenditures of state and local government and federal defense expenditures) but does
not include transfer payments (e.g. social security checks).
On the factor inputs side, labor hours are procyclical in that they have a strong positive correlation
of .87 with GDP. In addition, labor productivity (GDP divided by labor hours) is procyclical with a
correlation with GDP of .36. Both of these last two facts are illustrated in Figure 6.3. Although it is
not clear just now, it soon will be clear that this last finding (i.e. labor productivity is procyclical) is
quite interesting from the perspective of theoretical models that rely on an aggregate production
function. We will discuss this issue in the next section. Figure 6.3 seems to show that the cyclical
component of labor productivity has started to move in the opposite direction from the cyclical
component of GDP in the last few decades.
To sum up, Table 6.1 documents some facts about aggregate fluctuations in US data. A theory
of these fluctuations should say what are the fundamental sources (shocks?) that drive these
fluctuations. A good theory would describe why the proposed fundamental sources lead both hours
and GDP to move together as well as labor productivity and GDP to move together. A good theory
would also describe why the same sources lead investment to fluctuate much more in percentage
terms than GDP and non-durable consumption to fluctuate in percentage terms a bit less than GDP.
One might ask for a lot more but this would be a very good start. Thus, a theory could rely on
some exogenous elements (i.e. unexplained shocks) while having implications for the endogenous
variables (i.e. output, consumption, investment, hours and labor productivity).
Now let us think about this empirical fact within a simple model with an aggregate production
function.4 More specifically, let us think of models in which (i) output is produced by an aggregate
production function $Y_t = A_t F(K_t, L_t)$ and (ii) neither technology $A_t$ nor capital $K_t$ changes over time.
4 The business-cycle fact table from the previous section organized the “facts” already around the concept of an
So far we have not made any explicit assumption on whether consumers are rational or irrational
or on what the sources of shocks are that lead to variation in labor input. We have ruled out, by
assumption, a role for technology shocks.
One can now ask whether any theoretical model with these ingredients
will produce procyclical labor productivity. The answer will be no for production functions with
constant returns and diminishing marginal products. Times of high labor input will be times of
both high output and low labor productivity, other things equal. Figure 6.4 graphs two data points
$(L_1, Y_1)$ and $(L_2, Y_2)$ that are consistent with such a production function. Labor productivity is
simply the slope of the line segment connecting the origin to the data point $(L_1, Y_1)$ or the origin
to the data point $(L_2, Y_2)$. Clearly, the slope produced by the latter point is smaller than the slope
produced by the former point in Figure 6.4.
Under the stated assumptions, labor productivity must be low when output is high (e.g. output
level $Y_2$) by the diminishing marginal product of labor. Thus, if the model has fluctuations in output
driven by fluctuations in aggregate labor then it also has to have countercyclical labor productivity.
This logic holds independently of the motivations behind consumer behavior: the consumers
populating such a model could be rational or irrational. It also holds independently of the source of
the variation in labor input.5
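The argument can be checked with a specific production function. The sketch below assumes the Cobb-Douglas form Y = A K^β L^(1-β) used in the benchmark model, with hypothetical parameter values (β = 0.3, A = K = 1) and hypothetical function names; it is an illustration of the logic, not a calibrated exercise.

```python
def output(A, K, L, beta=0.3):
    """Cobb-Douglas output with capital share beta (assumed value)."""
    return A * K ** beta * L ** (1.0 - beta)


def labor_productivity(A, K, L, beta=0.3):
    """Output per unit of labor input."""
    return output(A, K, L, beta) / L


# Technology and capital held fixed; only labor input differs.
low_labor, high_labor = 1.0, 1.1
for L in (low_labor, high_labor):
    print(L, output(1.0, 1.0, L), labor_productivity(1.0, 1.0, L))
```

Output is higher at the higher labor input while output per unit of labor is lower, so fluctuations driven only by labor input generate countercyclical labor productivity in this example.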
Now it is important to ask how one could alter the assumptions on technology so as to allow a
model with an aggregate production function to produce procyclical labor productivity. We consider
two possibilities. Each possibility in Figure 6.5 “explains” the same data: the two data points are
the same in both panels of the figure.
Possibility 1 is to allow the technology to change over time. This is displayed on the left-hand
side of Figure 6.5. If labor input and technology move together in that their deviations from trend
are positively correlated, then times of high output will be times when both technology and labor
input are high. It can then be the case that high output is associated with high labor productivity. The
reason is that technology and labor have offsetting effects on labor productivity. More specifically,
higher labor decreases labor productivity other things equal but higher technology increases labor
productivity other things equal.
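The offsetting effects can be made explicit under an assumed functional form. Suppose, as in the benchmark model, that production is Cobb-Douglas, $Y_t = A_t K_t^{\beta} L_t^{1-\beta}$ (this specific form is an assumption made here for illustration). Taking logs of output per unit of labor gives:

```latex
\log\frac{Y_t}{L_t} = \log A_t + \beta\,(\log K_t - \log L_t)
```

Holding capital fixed, a one percent rise in labor input lowers labor productivity by roughly β percent, while a one percent rise in technology raises it by roughly one percent. Labor productivity therefore rises together with output whenever the percentage increase in technology exceeds β times the percentage increase in labor input.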
Possibility 2 is that technology does not change over time but that the production function
exhibits increasing returns in labor input alone. This is displayed on the right-hand side of Figure
6.5. Thus, a single curve passes through both data points. To pass through both points (and the
origin) the marginal product of labor increases in the labor input. Then one can see that times of
high labor input are times of high labor productivity so that again labor productivity is procyclical.
The problem with this explanation is that it implies a form of increasing returns to scale strongly
at odds with that estimated in a large empirical literature (both at the aggregate level and at lower
levels of aggregation such as sector, industry or firm levels) on this issue.
In what follows, we will choose to maintain the assumption that there is an aggregate production
function with standard properties (e.g. constant returns and diminishing marginal products). These
are precisely the key properties assumed and analyzed in standard growth theory. Given this
decision, we have articulated only one way to escape from countercyclical labor productivity. Thus,
we will shortly analyze a theory of the business cycle where technological shocks are the only
source of fluctuations. This follows the line of work for which Finn Kydland and Edward Prescott
received the Nobel Prize in 2004.
One might conjecture that many other shocks besides technology shocks could be potentially
5 The argument does make use of an other things equal assumption in that capital is held constant. Intuitively, with a
constant returns production function, if capital and labor both increase by the same percentage between two points in
time, then labor productivity is constant when output is high or low. This implies that what is happening to capital may
be important empirically for procyclical labor productivity. As long as the movements in capital input are smaller in
percentage terms than labor and not perfectly correlated, then the production function will not produce procyclical labor
productivity when the technology is constant returns to scale and technology does not change over time.
relevant (e.g. wars, demographic shocks, changes in government policy, news about the likelihood
of future shocks or future policy changes, changes in uncertainty and animal spirits) for business
cycle fluctuations or, more broadly, for aggregate fluctuations. However, the assumption of an
aggregate production function with constant returns and diminishing marginal products and no
technological change effectively implies that theories built from any combination of these “other”
shocks will produce countercyclical labor productivity if they have their effects only through their
impact on the quantities of factor inputs. Thus, this line of reasoning which employs an aggregate
production function suggests a key role for technology shocks but does not imply that other sources
of shocks are unimportant.
We acknowledge that this line of reasoning relies heavily on the aggregate production function.
It is of course possible to build up theories of the production side of the economy by aggregating the
behavior of many small firms. When all such small firms have the same constant returns production
function and behave competitively, then the theory implies that the economy functions as if there is
one firm with an aggregate production function which is the same function as that of any of the
small firms. When the small firms do not all have the same production function and there are frictions
that make it difficult to reallocate capital and labor across firms, then this gives rise to a richer
menu of possible reasons for why aggregate labor productivity is procyclical. Such theorizing also
suggests that it is micro-level data that is key for analyzing sources of movements in aggregate
productivity. The analysis of such a framework and corresponding micro-level data is well beyond
the scope of this book.
6.3 Technology Shocks and Business Cycles 85
$k_{t+1} = (1 - \alpha)(1 - \beta) A_t k_t^{\beta}$
To see how the model works, consider a special and very simple case where the technology
undergoes a one-time, permanent increase. At time t = 1 and all future periods the technology level
undergoes a permanent increase to level $A_{high}$ which exceeds the previous level labeled $A_{low}$. We
assume that the capital stock at time t = 1 inherited from past decisions is initially at the steady
state level associated with the previous (lower) level of technology $A_{low}$. The upshot of these
assumptions is that the law of motion for capital shifts permanently upwards and that because of
this the capital stock increases over time until it converges to the new higher steady state level.
It is easy to figure out how the change in technology affects other variables. First, as the capital
stock increases there is an investment boom (recall that $i_t = k_{t+1} - k_t(1 - \delta)$). Second, the level of
GDP per unit of labor (labor productivity) increases over time (recall $y_t = A_t k_t^{\beta}$). At the time of
the technology improvement this is entirely due to the technology term $A_t$. However, there is also a
delayed effect on $y_t$ due to the increase in the capital stock $k_t$ induced by the technology change.
Third, there is an increase in the real wage $w_t$ over time (recall $w_t = (1 - \beta) A_t k_t^{\beta}$). This increase
in wages occurs initially because of the increase in technology but is reinforced by the induced
increase of the capital-labor ratio $k_t$.
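The adjustment path after the permanent technology increase can be traced out by iterating the law of motion for capital. The sketch below is illustrative only: the parameter values (α = 0.5, β = 0.3, δ = 0.1, a technology level rising from 1.0 to 1.2) and the function name are hypothetical choices, not values taken from the text.

```python
def simulate_technology_shock(a_low, a_high, alpha, beta, delta, periods):
    """Iterate k' = (1 - alpha)(1 - beta) A k^beta after a permanent rise in A.

    Capital starts at the steady state for a_low; technology equals a_high
    from the first simulated period onward.
    """
    saving = (1.0 - alpha) * (1.0 - beta)
    k = (saving * a_low) ** (1.0 / (1.0 - beta))  # old steady state
    path = []
    for t in range(1, periods + 1):
        y = a_high * k ** beta          # output per unit of labor
        k_next = saving * y             # law of motion for capital
        path.append({
            "t": t,
            "k": k,
            "y": y,
            "w": (1.0 - beta) * y,            # real wage
            "i": k_next - k * (1.0 - delta),  # investment
        })
        k = k_next
    return path


path = simulate_technology_shock(1.0, 1.2, alpha=0.5, beta=0.3,
                                 delta=0.1, periods=30)
for row in path[:5]:
    print(row)
```

In this example output and wages jump in the first period because of the technology term alone, and then continue to rise as capital accumulates toward the new steady state $[(1 - \alpha)(1 - \beta) A_{high}]^{1/(1-\beta)}$.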
One notable problem with this simple model as a model of business-cycle fluctuations is that
by assumption labor does not move in response to anything in the model. The young work all
the time regardless of the precise level of the wage rate. This could be addressed by allowing an
alternative use for the time of young agents. One way to do this is to allow agents to split their total
time allocation of one unit between work and leisure. Thus, agents would need to decide whether
to work more or less when wages are high. In the model at present young agents spend all their
time working as leisure by assumption is not valued as it does not enter the utility function. A
more plausible model would let utility depend upon leisure time as well as goods consumption.
This change would give rise to a much richer model that is the focus of the bulk of the business-cycle
literature in the last several decades. This book lets you drive the Ford Model T, as its owner’s
manual is easy to understand, while the Ferrari stays in the garage.
6.4 The Keynesian View 87
“Even apart from the instability due to speculation, there is the instability due to the characteristic
of human nature that a large proportion of our positive activities depend on spontaneous
optimism rather than on a mathematical expectation, whether moral or hedonistic or economic.
Most, probably, of our decisions to do something positive, the full consequences of which will be
drawn out over many days to come, can only be taken as a result of animal spirits - of a spontaneous
urge to action rather than inaction, and not as the outcome of a weighted average of quantitative
benefits multiplied by quantitative probabilities.” - Keynes (1936, Chapter 12, p. 161)
It is clear that Keynes did not view his hypothesized animal spirits to be a result of a rational
calculation. Nevertheless, some economists have tried to work out theories of economic fluctuations
based on a rational version of the “animal spirits" hypothesis. A first step in this effort was to
establish the logical conditions under which swings in business confidence are rational and yet
unrelated to changes in fundamentals (e.g. changes in production functions or changes in the
population of the economy) but still affect real economic activity.7
We note that even if economic fluctuations based on this type of mechanism occur in a model
economy, this does not mean that these fluctuations will be consistent with observation. For example,
earlier in this chapter we noted that any type of labor supply behavior, rational or irrational, within
the context of a model with an unchanging, constant returns aggregate production function will
produce countercyclical labor productivity. In U.S. data labor productivity is procyclical. Thus,
simply adding in the logical possibility of animal spirits into the life-cycle model together with
a labor-leisure choice, without other changes to the framework, will not produce business-cycle
fluctuations with procyclical labor productivity.
What is the nature of consumer behavior in response to the fluctuations induced by animal
spirits? According to Keynes, the answer is that consumers can be viewed as applying a rule of
thumb. This view is embodied in the “Keynesian consumption function” that is sometimes taught
in introductory classes.
“The fundamental psychological law, upon which we are entitled to depend with great confidence
both a priori and from our knowledge of human nature and from the detailed facts of
experience, is that men are disposed, as a rule and on the average, to increase their consumption
6 See Keynes’s (1936) book “The General Theory of Employment, Interest and Money”. The page numbers in the
quotes from Keynes’ book are from the Harcourt Brace Jovanovich, 1964 Edition. This is a very difficult book to read.
Sir John Hicks wrote a 13 page article (see Hicks (1937), "Mr. Keynes and the Classics: A Suggested Interpretation"
Econometrica, 5, 147-59) that tried to reduce the content of Keynes’ book to two equations. A version of Hicks’
interpretation is what is typically presented as the Keynesian IS-LM model in intermediate macroeconomics textbooks.
7 See Cass and Shell (1983), Do Sunspots Matter?, Journal of Political Economy, 91, 193-227.
as their income increases, but not by as much as the increase in their income.” - Keynes (1936,
Chapter 8, p. 96.).
This view contrasts with the optimizing view of consumer behavior summarized in the chapter
on dynamic consumer theory. The optimizing view is presented in all microeconomic textbooks.
It is the dominant view employed in current theoretical and empirical work in the consumption
and saving literature. Two separate Nobel Prizes were awarded to Milton Friedman in 1976 and to
Franco Modigliani in 1985 for the development of this work on dynamic consumer theory.
According to the optimizing view, the response of a specific household’s current consumption
to an increase in current household income is not determined by some invariant proportion of the
percentage change in current income from the previous period’s income. Instead, the optimizing
view implies that the response of current consumption to a surprise change in current income
would depend on the length of the remaining lifetime, on preferences and on whether the change
in income is expected to be permanent or temporary in nature. Indeed, the theory predicts that
the consumption response to a temporary increase in income should be smaller than the same size
permanent increase. This implication of the theory is largely borne out in surveys of the empirical
work on this issue.8
1. $C + I + G = Y$
2. $C = a + b(Y - T)$
3. Implication: $[a + b(Y - T)] + I + G = Y$ or $Y = \frac{a - bT + I + G}{1 - b}$
for Public Policy, Journal of Economic Literature, vol. 48, pages 693-751.
above. It asserts that if for exogenous reasons investment I decreases then income Y also decreases.
Keynes’ view seems to be that investment is highly variable and is affected by the unexplained
animal spirits of investors.
Some economists and politicians try to apply this framework only when an economy is in a
recession or depression. Thus, Keynesian economics is sometimes termed "Slump Economics". A
typical argument is that there is extra labor that could be employed to increase income and output Y
as unemployment rates are high.
The issue is then what tools a government has to increase Y. We provide three answers.
All are based on the following relationship:
$Y = \frac{a - bT + I + G}{1 - b}$
A first answer comes from asking what is the impact on output of increased government
spending G without changing taxes T. It is straightforward to calculate the extra output ∆Y produced
by the model due to additional spending ∆G.9 The multiplier $\Delta Y / \Delta G$ equals $\frac{1}{1-b}$. It gives the impact on
income of increasing government spending by 1 unit. It is termed the unbalanced budget multiplier
because the extra spending is not required to be financed, in such a calculation, by extra taxes.
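Unbalanced Budget Multiplier: $\Delta Y = \frac{1}{1-b} \Delta G$ implies $\frac{\Delta Y}{\Delta G} = \frac{1}{1-b} > 0$
Balanced Budget Multiplier: $\Delta Y = \frac{1-b}{1-b} \Delta G$ implies $\frac{\Delta Y}{\Delta G} = \frac{1-b}{1-b} = 1$
Tax Multiplier: $\Delta Y = -\frac{b}{1-b} \Delta T$ implies $\frac{\Delta Y}{\Delta T} = -\frac{b}{1-b} < 0$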
A second answer comes from asking what is the extra income produced by increasing G and
at the same time increasing T by the same amount (i.e. ∆G = ∆T ). The answer is that income
increases exactly by the amount of the increase in spending and taxes as the balanced budget
multiplier is exactly equal to 1. The term balanced budget multiplier is not quite apt: what the
calculation requires is that any additional spending be financed by extra taxes, not
that total spending equal total taxes.
A third answer comes from increasing taxes T, holding government spending G fixed. The
answer is that the multiplier is negative because taxes enter negatively in the expression for output
in the Keynesian model. The tax multiplier is $\frac{\Delta Y}{\Delta T} = -\frac{b}{1-b}$.
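The three multipliers can be verified numerically by solving the simple Keynesian model before and after a policy change. The parameter values below (a = 10, b = 0.8, T = 20, I = 30, G = 40) and the function name are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def equilibrium_income(a, b, T, I, G):
    """Income Y that solves C + I + G = Y with C = a + b(Y - T)."""
    return (a - b * T + I + G) / (1.0 - b)


# Hypothetical baseline economy.
a, b, T, I, G = 10.0, 0.8, 20.0, 30.0, 40.0
base = equilibrium_income(a, b, T, I, G)

# Raise G by one unit, taxes unchanged (unbalanced budget multiplier).
unbalanced = equilibrium_income(a, b, T, I, G + 1.0) - base
# Raise G and T by one unit each (balanced budget multiplier).
balanced = equilibrium_income(a, b, T + 1.0, I, G + 1.0) - base
# Raise T by one unit, spending unchanged (tax multiplier).
tax = equilibrium_income(a, b, T + 1.0, I, G) - base

print(base, unbalanced, balanced, tax)
```

With b = 0.8 the formulas give 1/(1 - b) = 5, 1, and -b/(1 - b) = -4, which the numerical differences reproduce up to rounding error.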
The brief discussion above gives the simple prescription given by Keynesian economists (and
numerous politicians) for what to do when in a deep recession: increase spending, decrease taxes
or carry out a balanced budget increase in spending. While one can find some nuances to this
prescription in the editorial pages of many newspapers during the Great Recession of 2008, it is
fair to say that this is a main thrust of Keynesian economics. In the fiscal policy chapter we will
revisit the issue of “multipliers”. The life-cycle model also has multipliers associated with specific
changes in tax-spending plans or specific changes in tax plans to finance a given spending plan. It
will be useful to contrast the dynamic multipliers in the life-cycle model with the multipliers arising
from Keynesian models.
One can find more complicated Keynesian models in intermediate-level textbooks. Perhaps
the most common model is referred to as the IS-LM model. This model has both a “consumption
function” and an “investment function” rather than solely a consumption function. The investment
function depends on an interest rate r, an endogenous variable. Thus, in the IS-LM model there
are two endogenous variables (Y, r). The style of analysis is the same as in the simple Keynesian
model: one traces out via algebra or graphical methods how changes in exogenous variables impact
9 The symbol ∆ is commonly used in science and mathematics to denote a difference or a change in a variable and
that is the way it is used here. Thus, $\Delta Y = Y^{new} - Y^{old}$ and $\Delta G = G^{new} - G^{old}$. In the chapter on growth theory we used
a similar convention as part of a calculation of growth rates.
(Y, r) and one figures out multipliers associated with changes in policy variables such as taxes or
spending.
$x_i$ - payout in event $i$
$p_i$ - probability of event $i$
$u(x_i)$ - utility realized when event $i$ occurs under gamble $x$
$E[u(x)] \equiv \sum_i u(x_i) p_i = u(x_1) p_1 + u(x_2) p_2 + \cdots + u(x_n) p_n$ - expected utility of gamble $x$
The theory assumes that individuals rank gambles by computing expected utility. Gamble $x$
is strictly preferred over gamble $x'$ precisely when it delivers higher expected utility. To compute
expected utility, the individual is assumed to have a utility function $u(x_i)$ describing the (ex-post)
utility associated with any payout $x_i$. Before the agent knows the outcome of the gamble the
agent assigns an expected utility (denoted $E[u(x)]$) to the gamble. This expected utility is simply
the realized utility averaged using the subjective probabilities. Thus, this calculation follows the
10 This question was posed and answered in a provocative book by Robert Lucas (1987) entitled "Models of Business
Cycles", Blackwell, Oxford. Here we follow in a simplified manner the main outlines of his approach to answering this
question. A Wikipedia discussion of this issue can be found by Googling "welfare cost of business cycles". Robert Lucas
received the Nobel Prize in 1995 for his work in macroeconomics.
6.5 Smoothing Out the Business Cycle 91
calculation of expected values from standard probability theory as can be found in any book on
statistics or any account of the theory of probability.
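As a minimal sketch of this calculation, the snippet below computes expected utility as a probability-weighted average of realized utilities. The two-point gamble and the choice of a logarithmic utility function are illustrative assumptions only.

```python
import math


def expected_utility(payouts, probs, u):
    """E[u(x)] = sum over events i of u(x_i) * p_i."""
    return sum(u(x) * p for x, p in zip(payouts, probs))


# Assumed example: a gamble paying 98 or 102 with equal probability.
payouts = [98.0, 102.0]
probs = [0.5, 0.5]

eu_linear = expected_utility(payouts, probs, lambda x: x)  # expected payout
eu_log = expected_utility(payouts, probs, math.log)        # expected utility with u = log

print(eu_linear, eu_log, math.log(100.0))
```

With a concave utility function such as the logarithm, the expected utility of the gamble is below the utility of receiving its expected payout for sure.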
St Petersburg Paradox
To help understand why economists have adopted this theory, we take a small detour to discuss the
St Petersburg Paradox. This is a famous problem that goes back to the work of Daniel Bernoulli in
the 1700’s.11
Consider a simple gamble that is based on the toss of a fair coin. Let H denote heads and T
denote tails. It is understood that the probabilities of H and T are each one half. This simple gamble
pays off an amount which is equal to 2 to the power of the number of consecutive tails
thrown before the first toss of heads. Thus, this gamble offers a small probability of arbitrarily large
payouts.
If this gamble is offered on a one-time basis, then a common view is that very few individuals
would pay more than 100 dollars to accept this gamble. If the maximum amount a specific person
is willing to pay is strictly less than all the money they have, then this fact would rule
out one straightforward theory of behavior under risk. One theory of gambling behavior asserts that
one is willing to accept any gamble with a positive net expected payout. The key feature of the St
Petersburg gamble is that it has a gross expected payout which is infinite and thus the net expected
payout is also infinite no matter how much the individual pays to get this gamble!12 This can be
seen by computing the terms of the expected payout of this gamble as is carried out below. As each
individual term is equal to one half and there are infinitely many terms the gamble therefore has an
infinite expected payout.
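The term-by-term calculation can be sketched as follows. The gamble pays 2^k when k tails precede the first head, an event with probability (1/2)^(k+1), so each term of the expected payout equals one half. The function name and the truncation at a finite number of terms are choices made here for illustration.

```python
from fractions import Fraction


def truncated_expected_payout(n_terms):
    """Sum the first n_terms terms of the expected payout of the St Petersburg gamble."""
    total = Fraction(0)
    for k in range(n_terms):
        prob = Fraction(1, 2) ** (k + 1)  # k tails followed by the first head
        payout = 2 ** k                   # 2 to the power of the number of tails
        total += prob * payout            # each term equals exactly 1/2
    return total


for n in (10, 100, 1000):
    print(n, truncated_expected_payout(n))
```

The partial sums grow without bound, in proportion to the number of terms included, which is why the expected payout is infinite.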
To solve this equation for the compensation factor λ we need to take a stand on a useful utility
function u to analyze. Without specifying a utility function we can say that λ is positive provided
that there is diminishing marginal utility of consumption. Below we use the class of utility functions
known as “constant relative risk aversion” utility functions. These are indexed by the parameter
γ. The literature has established that γ measures the aversion to taking on proportional gambles.
Thus, the higher is γ the higher is the compensation required to take on such risk and the greater
the benefit to eliminating such risk. If one graphs this utility function it is clear that any value γ > 0
will be consistent with diminishing marginal utility.
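$$ \Rightarrow (1+\lambda)^{1-\gamma} = \frac{100^{1-\gamma}}{102^{1-\gamma}(1/2) + 98^{1-\gamma}(1/2)} $$
$$ \Rightarrow \lambda = \left[ \frac{100^{1-\gamma}}{102^{1-\gamma}(1/2) + 98^{1-\gamma}(1/2)} \right]^{1/(1-\gamma)} - 1 $$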
Table 6.3 below calculates the compensation factor λ for different values of the utility function
parameter γ. The results say that for the range of risk aversion coefficients considered the elimination
of aggregate risk is only equivalent to a very small proportional rise in consumption. This calculation
was first carried out by Robert Lucas in a slightly different way. His main conclusions are similar
to those below.13
Lucas took the position that a risk aversion coefficient of more than 10 is not reasonable. The
calculations based on γ = 4 indicate that the maximum gain to perfectly eliminating aggregate
fluctuations is less than a tenth of one percent of consumption. Aggregate consumption per person
in the U.S. economy in 2010 was about 33 thousand dollars (i.e. 10.2 trillion dollars divided by 308
million people). This number times the value of λ from Table 6.3 gives a free lunch worth $26.50
per person every year from eliminating aggregate fluctuations. This number seems very
small.
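The compensation factor can be computed directly from the closed-form expression for λ given above. The sketch below assumes the two-point gamble implied by that expression (consumption of 102 or 98 with equal probability versus 100 for sure); the function name and the list of γ values are illustrative, and the log-utility case γ = 1 is handled separately as the limiting case.

```python
import math


def compensation_factor(gamma, safe=100.0, low=98.0, high=102.0):
    """Solve u(safe) = 0.5*u((1+lam)*high) + 0.5*u((1+lam)*low) for lam under CRRA utility."""
    if gamma == 1.0:
        # Log utility: log(safe) = log(1 + lam) + 0.5 * log(low * high)
        return safe / math.sqrt(low * high) - 1.0
    risky = 0.5 * high ** (1.0 - gamma) + 0.5 * low ** (1.0 - gamma)
    return (safe ** (1.0 - gamma) / risky) ** (1.0 / (1.0 - gamma)) - 1.0


consumption_per_person = 10.2e12 / 308e6  # figures quoted in the text for 2010

for gamma in (0.5, 1.0, 2.0, 4.0, 10.0):
    lam = compensation_factor(gamma)
    print(gamma, lam, lam * consumption_per_person)

lam_4 = compensation_factor(4.0)
dollars_4 = lam_4 * consumption_per_person
```

For γ = 4 this gives λ of roughly 0.0008, which lines up with the per-person dollar figure quoted in the text.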
One thing that is missing from these calculations is a sense of how averse actual individuals are
to taking on risk. While the literature is full of different methods for using data to get an idea of the
magnitude of risk aversion, we will not discuss this evidence. Rather we will provide a table which
may help you to understand how your own aversion to taking on risk is related to the parameter γ
of the utility function used above.14
We now ask you the following question: what fraction of your wealth are you ready to give up
to escape the risk that you gain or lose a fraction α of your wealth with equal probability? Table 6.4
gives the answer to this question using the class of constant relative risk aversion utility functions
u stated earlier. Thus, you can use Table 6.4 in two ways. First, you can pick the row which best
reflects your attitudes to these two gambles and then see the risk aversion coefficient γ that would
produce these attitudes. Second, given a value of γ you can see how much wealth such a theoretical
agent would be willing to give up not to take these gambles. For example, if you are willing to give
up two percent of your wealth to avoid the α = 10% wealth gamble and sixteen percent of your
wealth to avoid the α = 30% wealth gamble, then your behavior is consistent with γ = 4.
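The kind of entry such a table contains can be reproduced with a short calculation. The sketch below assumes constant relative risk aversion utility and normalizes wealth to one; the function name is hypothetical, and the code is a way to check the γ = 4 example rather than a reproduction of the published table.

```python
import math


def wealth_fraction_given_up(gamma, alpha):
    """Fraction pi solving u(1 - pi) = 0.5*u(1 + alpha) + 0.5*u(1 - alpha) under CRRA utility."""
    if gamma == 1.0:
        certainty_equivalent = math.sqrt((1.0 + alpha) * (1.0 - alpha))
    else:
        mean_u = 0.5 * (1.0 + alpha) ** (1.0 - gamma) + 0.5 * (1.0 - alpha) ** (1.0 - gamma)
        certainty_equivalent = mean_u ** (1.0 / (1.0 - gamma))
    return 1.0 - certainty_equivalent


for gamma in (0.5, 1.0, 4.0, 10.0):
    print(gamma,
          round(wealth_fraction_given_up(gamma, 0.10), 3),
          round(wealth_fraction_given_up(gamma, 0.30), 3))
```

For γ = 4 the computed fractions are about two percent for the α = 10% gamble and about sixteen percent for the α = 30% gamble, in line with the example in the text.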
The answer that most people would give for the fraction of wealth given up to avoid these
gambles is consistent with a γ value less than 10. If this characterizes your answer, then there are two
possible conclusions to draw about the welfare gain from eliminating aggregate fluctuations.
One possibility is that some important feature of actual economies is missing from this calculation
so that the range of answers tabulated is not relevant and a verdict must await further analysis. Along this
line, an important literature in economics notes that the total consumption fluctuations experienced
by individual households are much larger in percentage terms than the observed fluctuations in
aggregate consumption data. Thus, an important debate centers on how any possible smoothing
of aggregate consumption impacts both the level of household consumption and the smoothing of
total consumption risk faced by individual households. The other possibility is that the welfare gain
to perfect business-cycle smoothing is at most a fraction of a percent of consumption each year.
Under this possibility, smoothing out the business cycle does not seem to be a big deal.
13 One difference was that instead of a two point distribution of risk, he analyzed the case where consumption risk x is
lognormally distributed. This leads to a simple approximation formula where the compensation is proportional to risk
aversion γ and to the variance $\sigma^2$ of the log of consumption risk: $\lambda = \gamma \sigma^2 / 2$. This gives results which are close to the
computations in the table when one sets $\sigma^2 = 0.02^2 = 0.0004$ so that a one standard deviation shock moves consumption
by two percent.
14 The table comes from Christian Gollier (2001, p.31) “The Economics of Risk and Time”, MIT Press.
One of the most important indicators in fiscal policy analysis is the ratio of government debt to
GDP, or the debt-GDP ratio. Figure 7.1 plots the U.S. debt-GDP ratio whereas Figure 7.2 plots the
same ratio for the United Kingdom.1 There are six US episodes in which this ratio rises by at least
20 percent in a small number of years.
The first four of these US episodes are (1) the Civil War, (2) World War I, (3) the Great
1 The US data come from Henning Bohn, "The Sustainability of Fiscal Policy in the United States" (in: R. Neck and J.
Sturm, "Sustainability of Public Debt", MIT Press 2008, pp.15-49). The data from his work is updated to include 2012
data. The data for the UK come from the Bank of England.
Depression and (4) World War II. The fifth US episode started around 1980 and is associated with
the label “Reaganomics”.2 The sixth and last episode is the rise in the ratio which accompanied the
recession that began in 2008, known as the Great Recession. Both figures show that, historically, major wars
in the US and the UK have in practice been financed largely by issuing more government debt.
[Figure 7.2: UK: Debt-GDP Ratio (%). Vertical axis: debt-GDP ratio in percent (0 to 300); horizontal axis: Year (1700 to 2000).]
Definition 7.1.1 The law of motion for debt states that future debt is equal to current debt
plus the deficit: $B_t = B_{t-1} + D_t = B_{t-1} + [G_t - T_t + r_t B_{t-1}]$. $B_t$ is real debt at time t. $D_t =
G_t - T_t + r_t B_{t-1}$ is the deficit at time t. $G_t$ and $T_t$ are government spending and net taxes collected at
time t and $r_t$ is the real interest rate at time t. The primary deficit is $PD_t = G_t - T_t$.
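A short sketch of how this law of motion accumulates debt over time is given below. The initial debt, spending, tax and interest rate numbers are made up for illustration, and the function name is hypothetical.

```python
def debt_path(b0, spending, taxes, rates):
    """Iterate B_t = B_{t-1} + G_t - T_t + r_t * B_{t-1}; returns [B_0, B_1, ...]."""
    path = [b0]
    for g, tax, r in zip(spending, taxes, rates):
        prev = path[-1]
        deficit = g - tax + r * prev  # primary deficit plus interest on last period's debt
        path.append(prev + deficit)
    return path


# Assumed inputs: three periods with a constant 2 percent real interest rate.
path = debt_path(100.0, [20.0, 20.0, 20.0], [18.0, 22.0, 25.0], [0.02, 0.02, 0.02])
print(path)
```

In this example debt rises in the first period, when there is a primary deficit, and falls in the third period, when the primary surplus exceeds the interest bill.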
When we apply this framework in later parts of this chapter, we will be thinking of a government
that issues default-free debt. This may describe some countries over some time periods but clearly
not all countries. In a number of countries the real market value of government debt can change by
virtue of default or by the anticipation that there has been a change in the probability that a future
default will occur. A debt default occurs when the country announces that it will miss one or many
of the scheduled payments on part or all of its debt. The experience of Argentina over 1999-2002 is
a good example of a country where government debt is viewed as subject to a potential default and
2 One account of this episode is presented in The Triumph of Politics: Why the Reagan Revolution Failed, by
David Stockman. Stockman was appointed to be the director of the Office of Management and Budget in the Reagan
administration.
where the market value of government debt experienced dramatic fluctuations. The experiences of
Greece and Ireland in 2010 are also good examples of countries whose debt was viewed as subject
to a potential default. The nominal interest rate implicit in the pricing of Greek government debt
was several percentage points above the comparable interest rate on German government debt. The
analysis in this chapter views government debt as default free. This is a natural first step in building
theory that incorporates government spending, taxation and debt.
We will now use the accounting equation to decompose how the U.S. debt-GDP ratio fell after
World War II. Step 1 below divides the accounting equation by GDP. Step 2 expresses the debt each
year as a ratio to GDP in that same year. Step 3 puts the change in the debt-GDP ratio on the left
hand side and three separate terms that add up to it on the right hand side.
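$$ \frac{B_t}{Y_t} = (1+r_t)\frac{B_{t-1}}{Y_t} + \frac{PD_t}{Y_t} \tag{7.1} $$
$$ \frac{B_t}{Y_t} - \frac{B_{t-1}}{Y_{t-1}} = (1+r_t)\frac{B_{t-1}}{Y_{t-1}}\frac{Y_{t-1}}{Y_t} - \frac{B_{t-1}}{Y_{t-1}} + \frac{PD_t}{Y_t} \tag{7.2} $$
$$ \frac{B_t}{Y_t} - \frac{B_{t-1}}{Y_{t-1}} = \left[\frac{1+r_t}{1+g_t} - 1\right]\frac{B_{t-1}}{Y_{t-1}} + \frac{PD_t}{Y_t} \tag{7.3} $$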
Equation 7.3 says that a change in the debt-GDP ratio, such as the fall in the debt-GDP ratio
after World War II, can be attributed to two main sources. The first source is a primary surplus,
captured by the term $PD_t / Y_t < 0$. A negative primary deficit is a primary surplus. The second source is
composed of all other terms, $\left[\frac{1+r_t}{1+g_t} - 1\right]\frac{B_{t-1}}{Y_{t-1}}$, where $g_t$ is the growth rate of GDP so that
$Y_t = (1+g_t) Y_{t-1}$. These terms encompass both the interest rate effect
and the growth effect. This term can be negative, leading to a reduction in the debt-output ratio,
when the GDP growth rate exceeds the interest rate.
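The decomposition in equation 7.3 can be illustrated with a one-period calculation. The numbers below are assumptions chosen so that growth exceeds the interest rate; the function name is hypothetical.

```python
def decompose_debt_ratio_change(b_prev, y_prev, r, growth, primary_deficit):
    """Split the change in B/Y into the interest-growth term and the primary-deficit term (eq. 7.3)."""
    y = y_prev * (1.0 + growth)               # Y_t = (1 + g_t) * Y_{t-1}
    b = (1.0 + r) * b_prev + primary_deficit  # accounting equation for debt
    total_change = b / y - b_prev / y_prev
    interest_growth_term = ((1.0 + r) / (1.0 + growth) - 1.0) * b_prev / y_prev
    primary_deficit_term = primary_deficit / y
    return total_change, interest_growth_term, primary_deficit_term


# Assumed example: debt equal to GDP, 2% interest, 5% growth, primary surplus of 2.
total, interest_growth, pd_term = decompose_debt_ratio_change(100.0, 100.0, 0.02, 0.05, -2.0)
print(total, interest_growth, pd_term)
```

Here both terms are negative: the primary surplus lowers the ratio, and so does the gap between the growth rate and the interest rate.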
[Figure 7.3: Change in US Debt-GDP Ratio, 1940 to 2020. Series plotted: Change in Debt-GDP Ratio; Primary Deficit.]
Figure 7.3 provides this decomposition. It plots the change in the US debt-GDP ratio in blue.
There is a big decrease in the debt-GDP ratio in the 1950’s so these values are negative. Figure 7.3
also plots the year-by-year values for the primary deficit as a ratio to GDP in red. We see that the
blue line is typically below the red line from 1950 to 1980. Thus, there is a remaining source of
the fall in the debt-GDP ratio. Equation 7.3 says that this remaining source is $\left[\frac{1+r_t}{1+g_t} - 1\right]\frac{B_{t-1}}{Y_{t-1}}$ and
Figure 7.3 implies that this term is on average negative from 1950 to 1980 because the blue line is
below the red line. This term can only be negative on average if the real return on US government
debt is below the growth rate of GDP.
Figure 7.3 suggests that for part of the post WWII period the growth rate of GDP exceeded the
average interest rate paid on government debt. This finding is an uncomfortable one for economists
equipped with a simple growth model. The reason is that the interest rate is below the growth rate
in such models exactly when the economy is above the Golden Rule capital-labor ratio. Economists
do not believe that the U.S. economy or any advanced economy is above the Golden Rule and is
suffering from having too much capital as discussed in Chapter 3. Thus, a common view is that a
more sophisticated theory that allows a role for aggregate risk to impact growth rates and returns to
assets is needed to adequately interpret this last fact.
$$ B_t = B_{t-1} R_t + G_t - T_t \tag{7.4} $$
$$ B_{t+1} = B_t R_{t+1} + G_{t+1} - T_{t+1} \tag{7.5} $$
Substitute $B_t$ from 7.4 into the right hand side of 7.5, after dividing 7.5 by $R_{t+1}$. This produces
equation 7.6.
$$ \frac{B_{t+1}}{R_{t+1}} = B_{t-1} R_t + G_t - T_t + \frac{G_{t+1} - T_{t+1}}{R_{t+1}} \tag{7.6} $$
This procedure is similar to that used to derive the present value budget constraint used in consumer
theory. Since the government lives indefinitely, one can apply the same procedure to sequentially
express the period constraints corresponding to time t + 2, t + 3, and so on, up to period t + n in
terms of initial debt and the sequence of primary deficits between period t and t + n. At each step,
one takes the present value of both sides of a given future period constraint, and then substitutes
debt from the previous period using the expression for accumulated deficits. For example, for debt in
period t + 8 one takes the period budget constraint $B_{t+8} = B_{t+7} R_{t+8} + G_{t+8} - T_{t+8}$, divides by the
product of the period discount rates $R_{t+1} R_{t+2} \cdots R_{t+8}$, and substitutes for debt $B_{t+7}$ using the period t + 7 analog
of equation 7.6. One can see that this operation, repeated for periods t + 1 through t + n, leads to
equation 7.7, below:
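$$ \frac{B_{t+n}}{R_{t+1} R_{t+2} \cdots R_{t+n}} = B_{t-1} R_t + G_t - T_t + \frac{G_{t+1} - T_{t+1}}{R_{t+1}} + \frac{G_{t+2} - T_{t+2}}{R_{t+1} R_{t+2}} + \cdots + \frac{G_{t+n} - T_{t+n}}{R_{t+1} R_{t+2} \cdots R_{t+n}} \tag{7.7} $$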
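Equation 7.7 is an accounting identity, so it can be checked numerically by iterating the period budget constraint forward and comparing both sides. The sketch below uses made-up gross interest rates, spending and taxes; the function names are hypothetical.

```python
def accumulate_debt(b_prev, R, G, T):
    """Iterate B_s = B_{s-1} * R_s + G_s - T_s over periods t, ..., t+n.

    R, G and T are lists whose index 0 corresponds to period t.
    """
    b = b_prev
    for rate, spend, tax in zip(R, G, T):
        b = b * rate + spend - tax
    return b


def present_value_sides(b_prev, R, G, T):
    """Return (left side, right side) of equation 7.7."""
    discount = 1.0                      # running product R_{t+1} * ... * R_{t+j}
    rhs = b_prev * R[0] + G[0] - T[0]   # B_{t-1} R_t + G_t - T_t
    for j in range(1, len(R)):
        discount *= R[j]
        rhs += (G[j] - T[j]) / discount
    lhs = accumulate_debt(b_prev, R, G, T) / discount
    return lhs, rhs


# Assumed numbers for periods t, ..., t+3 (gross interest rates R = 1 + r).
R = [1.03, 1.02, 1.04, 1.03]
G = [20.0, 21.0, 19.0, 22.0]
T = [18.0, 22.0, 20.0, 21.0]
lhs, rhs = present_value_sides(100.0, R, G, T)
print(lhs, rhs)
```

The two sides agree up to floating-point rounding for any choice of inputs, since the right side is just the left side with debt substituted out period by period.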
To obtain the present value budget constraint of the government we need two more steps.
The first step is to assume that the term $\frac{B_{t+n}}{R_{t+1} \cdots R_{t+n}}$ on the left hand side of equation 7.7 approaches zero as the number n of periods
we look into the future increases. This assumption puts some limitations on how fast debt can
grow. More precisely, it assumes that the debt (eventually) grows at a slower rate over time than
the rate of interest. For example, this assumption precludes the government from “rolling over”
existing debt forever. If the government did roll over its debt forever while holding primary deficits at zero, then the debt
would grow at exactly the rate of interest. The second step is to reorganize the equation so that
government outlays (spending plus initial debt with interest) are on the left hand
side and government revenues (taxes) are on the right hand side. The following definition contains
this budget constraint.
Definition 7.2.1 The Present Value Budget Constraint of the Government is as follows:
Because of the assumption that the government is infinitely-lived, the present values involved
in the previous definition involve infinite sums.
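The displayed formula for this definition does not appear in this text. A reconstruction consistent with equation 7.7 and with the two steps just described places outlays on the left and taxes on the right; its exact form is an assumption made here, not a quotation:

```latex
B_{t-1} R_t + G_t + \sum_{j=1}^{\infty} \frac{G_{t+j}}{R_{t+1} R_{t+2} \cdots R_{t+j}}
  = T_t + \sum_{j=1}^{\infty} \frac{T_{t+j}}{R_{t+1} R_{t+2} \cdots R_{t+j}}
```

This follows from setting the left hand side of equation 7.7 to zero in the limit and moving all tax terms to the other side.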
We can offer two interpretations for what this budget constraint implies. The first interpretation
is that current and future taxes must pay for current debt as well as all current and future spending.
The second interpretation is perhaps more interesting. It is the assertion that a tax cut without
an eventual spending cut is not really a tax cut. To illustrate this interpretation, consider two
hypothetical plans: a status quo plan and a new tax and spending plan. Suppose that the new plan is
viewed as offering a tax cut in that current taxes are lower under the new plan: $T_t^{new} < T_t$. If the new
plan satisfies the government budget constraint and the left-hand side does not change (no change
in spending and no defaulting on the debt), then the right-hand side cannot change. It has to have
the same present value. So a tax cut in period t is not really a tax cut as taxes in some future period
must be raised. This interpretation highlights the point that political pronouncements concerning
proposed tax cuts which are not clearly related to spending cuts may be inconsistent with the present
value budget constraint. Regardless of what interpretations we assign to the present-value budget
constraint, we will view it as a basic restriction on tax, spending and debt plans in that they must
add up so as to satisfy this equation.
We will now work out the law of motion for the capital-labor ratio. We do this first without
a government and then with a government that can tax, spend and borrow. We will then see how
taxing, spending and borrowing affect the law of motion. The first equation below states that,
without a government and with α = 0, young agents save all their wages. The logic is that the
government does not take away any of the young agent’s wages in taxes and that they only care
about consumption when old and thus young agents save everything they can. With α = 0 young
agents save all their wages and the law of motion of the life cycle model without government is
simply:
$$ k_{t+1} = w_t = (1-\beta) A k_t^{\beta} $$
When there are taxes, young agents save all of their wages after taxes, where wages after taxes are
denoted by the bracketed term $(w_t - T_{yt})$.
When there is a government, young agents have two forms in which to hold their savings:
risk-free government debt b and physical capital k.3 Thus, the part of their savings that is held in
physical capital is all their savings $(w_t - T_{yt})$ less the part held in government debt $b_t$.
The law of motion for the capital-labor ratio with a government is thus given by:
$$ k_{t+1} = (w_t - T_{yt}) - b_t = ((1-\beta) A k_t^{\beta} - T_{yt}) - b_t $$
The law of motion says that the capital-labor ratio next period equals the wage per young agent
less taxes on the young and less the amount of government debt per young agent issued by the
government.
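A minimal simulation of this law of motion, with and without a government, is sketched below. The values of A and β, the starting capital-labor ratio, and the constant tax and debt levels are all illustrative assumptions.

```python
def next_capital(k, A, beta, tax_young=0.0, debt=0.0):
    """k_{t+1} = (1 - beta) * A * k_t**beta - T_yt - b_t."""
    wage = (1.0 - beta) * A * k ** beta
    return wage - tax_young - debt


def simulate(k0, periods, A, beta, tax_young=0.0, debt=0.0):
    """Path of the capital-labor ratio when taxes and debt per young agent are constant."""
    path = [k0]
    for _ in range(periods):
        path.append(next_capital(path[-1], A, beta, tax_young, debt))
    return path


A, beta = 10.0, 0.3  # assumed technology parameters
no_gov = simulate(5.0, 30, A, beta)
with_gov = simulate(5.0, 30, A, beta, tax_young=1.0, debt=0.5)

steady_state_no_gov = ((1.0 - beta) * A) ** (1.0 / (1.0 - beta))
print(no_gov[-1], steady_state_no_gov, with_gov[-1])
```

Taxing or borrowing from the young shifts the law of motion down, so the simulated path with a government lies below the path without one in every period after the first.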
Note that the timing convention used for government debt is different from that used for the
capital stock. $B_t$ is total debt to be repaid at time t + 1 (accumulated during t) while $K_{t+1}$ is capital
accumulated during t, which produces output in t + 1. We use these two timing conventions (i) for consistency
with the decomposition above and (ii) to illustrate both conventions.
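[Figure: law of motion for the capital-labor ratio. Vertical axis: $k_{t+1}$; horizontal axis: $k_t$. Curves plotted: the line $k_{t+1} = k_t$; $k_{t+1} = (1-\beta) A k_t^{\beta}$; $k_{t+1} = (1-\beta) A k_t^{\beta} - T_{yt} - b_t$.]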
The term $k_{t+1}$ is the capital-labor ratio whereas the term $b_t$ is the government debt to labor ratio.
The terms $(k_{t+1}, b_t)$ can also be interpreted as the capital and debt per young agent.
Figure 7.3 plots a diagram describing the law of motion. This follows the treatment of this law
of motion from the chapter on the life-cycle model. The law of motion without a government is
closely related to the law of motion with a government. They differ by a vertical shift capturing the
3 In the model young agents are indifferent between holding government debt or physical capital provided both assets
pay the same return. Thus, when we analyze government debt we will assume both assets have the same risk-free return
and that both assets are riskless. Clearly, we abstract from risk to simplify the theory.
role of taxing young agents and borrowing from young agents. Both taxing and borrowing take
goods out of the hands of young agents that otherwise would have been converted into physical
capital. The graph shows that now there is more than one steady state with a positive capital-labor
ratio.
$$ \text{Policy 1}: \quad PV\ Tax = T_{yt} + \frac{T_{ot+1}}{1+r_{t+1}} = g + \frac{0}{1+r_{t+1}} = g $$
What we have just seen is an illustration of an important principle. Economists call this Ricardian
Equivalence. More specifically, economists note that many seemingly different government
policies can lead to exactly the same implications for outcomes. Two policies leading to the
same consequences are said to display Ricardian Equivalence. The principle behind the Ricardian
Equivalence result displayed here is that there are many seemingly different tax policies that still
impose the same present value of taxation on each agent, while keeping government spending on
goods the same across the policies. Two such policies then leave budget sets unchanged. Thus,
regardless of the nature of an agent’s preferences, rational choice then implies that consumption
choices must be unchanged across two such policies. It is clear that debt or asset choices are
changed but the all-important consumption choices are unchanged.
7.5 Multipliers
In the Great Recession of 2008, policy makers in many high-income countries were considering
policy responses to lift their economies out of recession. One policy response is a temporary
change in government purchases or a temporary change in some taxes. Dynamic spending or tax
“multipliers” describe the ratio of the change in output $\Delta Y_{t+n}$ at some horizon n to the increase
in government spending $\Delta G_t$ or taxes $\Delta T_t$ in time period t that caused it. This section does two things. First,
the logic behind an empirical attempt to estimate dynamic multipliers will be explained. Second,
theory-based multipliers from two very different theoretical models will be calculated.
7.5.1 Empirics
This section describes the basic ideas behind the calculation of tax multipliers in applied work. In
doing so we employ the basic ideas presented in the work of Romer and Romer (2010).4
Consider the first equation in the statistical model below. It relates the change in output
$\Delta Y_t = Y_t - Y_{t-1}$ at time t to the change in (legislated) taxes $\Delta T_t = T_t - T_{t-1}$ at time t. The statistical
model posits that the change in output is a linear function of the change in taxes plus a disturbance
or shock term $\varepsilon_t$. If we measure $Y_t$ as the log of GDP and $T_t$ as the log of taxes (both at time t), then
$\Delta Y_t$ is the output growth rate and $\Delta T_t$ is the growth rate in taxes. The second equation acknowledges
that there is possibly a long list of factors that may impact output, other than taxes. Adding these
up produces the disturbance $\varepsilon_t = \sum_{i=1}^{K} \varepsilon_t^i = \varepsilon_t^1 + \cdots + \varepsilon_t^K$.
The third equation posits that the change in (legislated) taxes that is measured in the data is due
to two sources. One source, $\sum_{i=1}^{K} b_t^i \varepsilon_t^i$, is that part of the change in taxes occurs in response to the
same sources of variation leading to changes in output - the vector $(\varepsilon_t^1, \cdots, \varepsilon_t^K)$. They call this part
the “endogenous tax changes”. For example, policy makers may systematically reduce some taxes
when the economy enters a recession and output is low. The other source of variation captures the
possibility that some legislated tax changes may not be done in response to variables (the vector
$(\varepsilon_t^1, \cdots, \varepsilon_t^K)$) that directly impact output. This is the key assumption in their analysis. They call
this part “exogenous tax changes” and it is represented by the term $\sum_{j=1}^{L} \omega_t^j$.
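The three equations of the statistical model are described verbally above but are not displayed in this text. A form consistent with that description would be the following; the exact layout is an assumption made here:

```latex
\begin{align*}
\Delta Y_t &= \alpha + \beta \Delta T_t + \varepsilon_t \\
\varepsilon_t &= \sum_{i=1}^{K} \varepsilon_t^i \\
\Delta T_t &= \sum_{i=1}^{K} b_t^i \varepsilon_t^i + \sum_{j=1}^{L} \omega_t^j
\end{align*}
```

Substituting the second and third lines into the first yields the combined equation (*).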
Now combine the three equations above into equation (*) below:
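$$ (*) \quad \Delta Y_t = \alpha + \beta \Big( \sum_{j=1}^{L} \omega_t^j \Big) + \Big[ \beta \sum_{i=1}^{K} b_t^i \varepsilon_t^i + \sum_{i=1}^{K} \varepsilon_t^i \Big] $$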
Equation (*) is key in the analysis of Romer and Romer (2010). They use the “narrative method”
to try to find and calculate in quarterly US data the change in legislated federal taxes (the term
$\sum_{j=1}^{L} \omega_t^j$ in equation (*)) that is not in response to the factors directly impacting output. They then
use linear regression methods to estimate versions of this equation under the assumption that the
calculated change in legislated tax term is uncorrelated (i.e. not systematically related) with the
disturbance term in square brackets.5 An estimate of the parameter β is then their estimate of the
contemporaneous “government tax multiplier”.
4 See Christina Romer and David Romer (2010), The Macroeconomic Effects of Tax Changes: Estimates Based on a
New Measure of Fiscal Shocks, American Economic Review, 100, 763-801.
5 The technique of linear regression, taught in an econometrics course or a statistics course, produces an unbiased
How do Romer and Romer (2010) try to separate legislated tax changes into endogenous and
exogenous parts? They give two examples of exogenous tax changes. The Clinton era tax increase
of 1993 is an example where they claim that taxes were raised not because policy makers felt
that the economy needed to be restrained but because policy makers felt it was prudent and might
increase long-run growth. Thus, roughly put, the tax change was not in response to current shocks
impacting output. The Kennedy-Johnson era tax cut of 1964 is an example where the claim is that
it was put in place to help long-run growth. They arrive at this classification by reading various
accounts in government publications of the motivation behind the tax changes - hence they label the
method the “narrative method”. An example of an endogenous tax change is a change motivated by
smoothing out a recession. For example, Romer and Romer put the tax cut of 1975 in this category.
Presumably, the temporary payroll tax cut associated with the Great Recession of 2008 would be
classified as endogenous.
The results reported in Romer and Romer (2010) are based on a more elaborate version of the
simple statistical model described above. They employ the equation $\Delta Y_t = \alpha + \sum_{i=0}^{M} \beta_i \Delta T_{t-i} + \varepsilon_t$,
which allows tax changes in the current period $\Delta T_t$ and previous periods $\Delta T_{t-i}$ for i ≥ 1 to impact
current output changes. The variable $\Delta T_{t-i}$ is measured using the “exogenous part” of tax changes
from their narrative approach. Thus, it corresponds to the term $\sum_{j=1}^{L} \omega_t^j$ used in the simple statistical
model.
$$ \Delta Y_t = \alpha + \sum_{i=0}^{M} \beta_i \Delta T_{t-i} + \varepsilon_t $$
Their estimate of the multiplier n quarters after an exogenous tax increase equal to 1 percent of
GDP is given by summing the coefficients up to lag n, so that the dynamic multiplier at horizon n quarters
ahead is $\beta_0 + \beta_1 + \cdots + \beta_n$. Romer and Romer (2010, Figure 4) find that (i) the contemporaneous
multiplier is zero (i.e. β0 = 0), (ii) the multiplier falls as the horizon n increases and (iii) the
multiplier is approximately −3.0 eight quarters ahead when n = 8. Thus, they find that exogenous
tax increases are contractionary in US data. They lead to decreases in output of greater size, eight
quarters ahead, than the size of the tax increase because the sum of the estimated coefficients is
below −1.
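The cumulative-sum construction of the dynamic multiplier can be sketched in a few lines. The lag coefficients below are hypothetical values invented for illustration, chosen only to mimic the qualitative pattern described above; they are not Romer and Romer's estimates.

```python
from itertools import accumulate

# Hypothetical lag coefficients beta_0, ..., beta_8 (NOT estimates from the paper).
betas = [0.0, -0.2, -0.3, -0.4, -0.5, -0.5, -0.4, -0.4, -0.3]

# Dynamic multiplier at horizon n is beta_0 + beta_1 + ... + beta_n.
dynamic_multipliers = list(accumulate(betas))

for horizon, multiplier in enumerate(dynamic_multipliers):
    print(horizon, round(multiplier, 2))
```

With these made-up coefficients the multiplier starts at zero, falls as the horizon lengthens, and reaches about −3 at eight quarters.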
Keynesian Model
We have already worked out the tax multiplier from the Keynesian model in Section 6.4. The
first equation below is output in the Keynesian model. The second equation is the tax multiplier
that is implied by the first equation. Thus, the theory says that the contemporary effect of an
increase in taxes, holding spending constant, is to decrease output. There are no dynamic effects as
the Keynesian model is static in that variables at different time periods do not enter the equation.
The empirical multiplier from Romer and Romer (2010) indicated that the contemporaneous tax
multiplier was zero but the tax multiplier was −3.0 eight quarters ahead.
$$ Y = \frac{a - bT + I + G}{1-b} $$
$$ \frac{\Delta Y}{\Delta T} = -\frac{b}{1-b} < 0 $$
Life-Cycle Model
To generate multipliers in the life-cycle model we compare a benchmark scenario to an alternative
scenario. The alternative scenario features an increase in government spending in the initial period -
say period t = 0. We calculate output paths over time in both scenarios. If the output is greater in
period t in the alternative scenario compared to the benchmark scenario so that $y_t^{alter} - y_t^{bench} > 0$,
then we say that the government spending multiplier is positive in period t.6 We want to figure out
whether multipliers are positive or negative at different time horizons.
We keep the benchmark scenario simple by assuming that the economy is in a steady state with
no government spending, no taxes and no debt. Thus, the output-labor ratio and the capital-labor
ratio are constant over time in the benchmark scenario. The alternative scenario entails positive
government spending $g_0^{alter} > 0$ in period zero but no government spending in all future periods (i.e.
$g_t^{alter} = 0$ for t ≥ 1). Thus, we analyze a one-time increase in spending. We assume for convenience
that the government spending is not a substitute for private consumption (e.g. funding NASA to
send a space craft to Mars is probably not a close substitute for restaurant meals). This assumption
is consistent with those that Keynesian economists have often found to be convenient.
Following the logic of the model, any government spending plan has to be associated with a tax
plan that satisfies the government’s intertemporal budget constraint. There are many tax plans to do
this. Each tax plan will have associated with it a whole set of government spending multipliers
at different time horizons because the output path depends, in general, on the tax plan. We pick a
simple plan and set $T_y = T_o = g_0^{alter}/2$ at time t = 0 with no taxes on any young or old agent at any
future date. Thus, the government runs a balanced budget at all points in time and at time t = 0 the
young and old agents equally share the cost of government spending.
What happens in the alternative scenario? To figure this out consider what happens to the law
of motion in the alternative scenario compared to the benchmark scenario. The general equation
for this law of motion is provided below and was graphed previously in Figure 7.3. It is clear that
the law of motion graphing next period's capital-labor ratio as a function of this period's capital-labor
ratio shifts downward for exactly one period (in period t = 0) and then shifts back to the
original position. This occurs as young agents under the alternative scenario have less to save as the
government is now taxing them $T_y = g_0^{alter}/2 > 0$ in period t = 0. The government does not borrow
from the young so $b_t = 0$ in all periods.
$$ k_{t+1} = w_t - T_{yt} - b_t = (1-\beta) A k_t^{\beta} - T_{yt} - b_t $$
The consequence is that in period t = 1 the capital-labor ratio is smaller in the alternative
scenario than in the benchmark. Thus, output is lower in the alternative scenario compared to the
benchmark as $y_t^{alter} = F(k_t^{alter}, 1) < y_t^{bench} = F(k_t^{bench}, 1)$. Over time the capital-labor ratio in the
alternative scenario converges to the level in the benchmark scenario from below. Thus, output in
the alternative scenario also converges over time to that in the benchmark scenario from below. The
outcome is that the government spending multiplier is negative at horizon t = 1, 2, ... and is exactly
zero at horizon t = 0 in the life-cycle model. In the Keynesian model the multiplier associated with
a balanced-budget spending increase was positive. More specifically, the multiplier at horizon t = 0
in the Keynesian model is positive. The empirical results in Romer and Romer (2010) were that the
contemporaneous multiplier for taxes was zero and that the multiplier was negative for horizons
several quarters ahead.
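The comparison between the two scenarios can be traced numerically. The sketch below is only
illustrative. It assumes a Cobb-Douglas technology y = A k^β consistent with the law of motion
above, starts both scenarios at the benchmark steady state, assumes benchmark spending is zero,
and uses made-up parameter values (A = 10, β = 0.3, g = 1). The function names are ours, not the
text's. The multiplier at horizon t is computed as the output difference at t divided by the
difference in time-0 spending.

```python
def simulate_capital(k0, taxes_on_young, A=10.0, beta=0.3):
    """Iterate k_{t+1} = (1 - beta) * A * k_t**beta - T_yt, with b_t = 0 in all periods."""
    path = [k0]
    for tax in taxes_on_young:
        wage = (1.0 - beta) * A * path[-1] ** beta
        path.append(wage - tax)
    return path


def output(k, A=10.0, beta=0.3):
    """Output per worker y = F(k, 1) = A * k**beta (assumed Cobb-Douglas form)."""
    return A * k ** beta


def spending_multipliers(g_alter=1.0, horizon=8, A=10.0, beta=0.3):
    """Multipliers (y_alter - y_bench) / g_alter at t = 0..horizon (benchmark spending assumed 0)."""
    k_star = ((1.0 - beta) * A) ** (1.0 / (1.0 - beta))  # benchmark steady state
    no_tax = [0.0] * horizon
    one_time_tax = [g_alter / 2.0] + [0.0] * (horizon - 1)  # young pay half of g at t = 0
    k_bench = simulate_capital(k_star, no_tax, A, beta)
    k_alter = simulate_capital(k_star, one_time_tax, A, beta)
    return [
        (output(ka, A, beta) - output(kb, A, beta)) / g_alter
        for ka, kb in zip(k_alter, k_bench)
    ]


multipliers = spending_multipliers()
for t, m in enumerate(multipliers):
    print(f"horizon {t}: multiplier {m:.4f}")
```

Under these assumptions the multiplier is zero at t = 0, because the time-0 capital stock is the
same in both scenarios, and it is negative but shrinking in size at later horizons.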
6 One could also calculate a government spending multiplier at horizon t as
$(y_t^{alter} - y_t^{bench})/(g_0^{alter} - g_0^{bench})$.
106 Chapter 7. Fiscal Policy
The only way to get a positive multiplier from a tax increase in the life-cycle model is to get
labor or capital inputs to increase beyond their levels in the benchmark model. The model has
young agents always supplying the same amount of labor input regardless of the wage rate. Thus,
without a substantial change to the model (a change that alters attitudes to leisure and labor),
the sign of the multiplier boils down to how the taxes that finance government spending affect
saving. The young agents at time t = 0 have fewer lifetime resources because the government taxes
them. Thus, they save less than in the benchmark scenario. This leads to negative spending
multipliers at dates t = 1, 2, ....
We acknowledge that allowing for a meaningful labor-leisure choice is an important theoretical
route to positive government spending multipliers. Higher spending financed by higher (lump-sum)
taxes may lead agents to work more and produce more output when leisure is a normal good. The
mechanism is that the taxes make agents poorer when the government spending is not a substitute
for private consumption. Poorer agents choose less leisure and more labor, which helps to
increase output.
long as the transfer they receive in the existing system exceeds the transfer in the alternative system.
This is because, in the model, the payment to capital that the old agents receive depends only
on the total quantities of factor inputs (K, L) and these payments are not altered by whether or not
the old agents receive a transfer. The policy preferences for the young agents at different dates are
trickier to determine but can be worked out for some simple policies.
To understand the consequences of moving to policy $(T_{yt}, T_{ot}) = (s, -s)$ from a policy of
no taxation, no government transfers, no spending and no debt, we highlight how the new policy
shifts the law of motion for the capital-labor ratio. The law of motion is
$$k_{t+1} = (w_t - T_{yt}) - b_t = ((1-\beta) A k_t^{\beta} - T_{yt}) - b_t.$$
The insight from the previous section is that government policy shifts down the law of motion by
exactly the taxes on young agents $T_{yt}$ plus the borrowing from young agents $b_t$. Thus, under
the pay-as-you-go social security system the law of motion shifts down by exactly
$T_{yt} = s > 0$.
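[Figure: law of motion for the capital-labor ratio. Horizontal axis: $k_t$. Vertical axis:
$k_{t+1}$. Curves shown: the 45-degree line $k_{t+1} = k_t$, the law of motion without social
security $k_{t+1} = (1-\beta) A k_t^{\beta}$, and the law of motion with social security
$k_{t+1} = (1-\beta) A k_t^{\beta} - s$.]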
The theory then predicts that over time this causes the capital-labor ratio to fall to a lower
steady-state level. Intuitively, this raises the utility of currently old agents at the time of the policy
change but lowers the utility of all agents born far in the future. These agents are born into a world
with lower capital, wages and output. For this intuition to prove correct within the model it is
important that the initial steady state is below the Golden rule steady state. As long as this holds
then the new steady state will not only have lower capital, wages and output but will also have
lower consumption.
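The fall of the capital-labor ratio to a lower steady state can be illustrated by iterating the
shifted law of motion. The sketch below is a numerical illustration under assumed values
(A = 10, β = 0.3, s = 1) and an assumed Cobb-Douglas form for wages and output. None of these
numbers come from the text, and the function names are ours. It only tracks capital, wages and
output. It does not check the Golden rule condition needed for the consumption claim.

```python
def transition_path(s, periods=200, A=10.0, beta=0.3):
    """Iterate k_{t+1} = (1 - beta) * A * k_t**beta - s from the steady state with s = 0."""
    k = ((1.0 - beta) * A) ** (1.0 / (1.0 - beta))  # steady state without social security
    path = [k]
    for _ in range(periods):
        k = (1.0 - beta) * A * k ** beta - s
        path.append(k)
    return path


def wage(k, A=10.0, beta=0.3):
    """Wage implied by the law of motion: w = (1 - beta) * A * k**beta."""
    return (1.0 - beta) * A * k ** beta


def output(k, A=10.0, beta=0.3):
    """Output per worker, assumed to be y = A * k**beta."""
    return A * k ** beta


path = transition_path(s=1.0)
k_old, k_new = path[0], path[-1]
print(f"capital: {k_old:.3f} -> {k_new:.3f}")
print(f"wage:    {wage(k_old):.3f} -> {wage(k_new):.3f}")
print(f"output:  {output(k_old):.3f} -> {output(k_new):.3f}")
```

With these illustrative values the path declines each period and settles at a new steady state
with lower capital, wages and output. A much larger s could leave the shifted law of motion with
no positive steady state, a case the sketch does not handle.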
It is useful to try to get an intuitive idea of what social security does in this model. One way to
do so is to calculate the present value of taxation on an agent. We do so below. The calculation
shows that a pay-as-you-go system is equivalent to a net tax in present-value terms exactly when
the real interest rate is positive. Thus, from this perspective the social security system within this
model amounts to providing a gift of s > 0 to each initial old agent that is paid for by imposing a
net tax on all current young agents and on all agents born in the future.
$$\text{PV Tax} = T_{yt} + \frac{T_{ot+1}}{1 + r_{t+1}} = s + \frac{-s}{1 + r_{t+1}} = \frac{s\, r_{t+1}}{1 + r_{t+1}} > 0$$
A second way to view social security in this model is to say that social security is equivalent to
the government forcing all agents, other than the initial old agents, into giving the government a
zero real interest rate loan. This interpretation comes from the point that agents give up s when
young and get back s when old so that no “interest” is paid in the model social security arrangement.
It also comes from the term $s r_{t+1}$ in the numerator of the equation above. This is the
interest not paid by the social security system. Clearly, when the market interest rate is
positive, these agents do not benefit from participating in this scheme!
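Both interpretations can be checked with a few lines of arithmetic. The sketch below uses
hypothetical values (s = 1 and a real interest rate of 5 percent) and function names of our own
choosing. It computes the present value of the tax pair (s, −s) and compares it with the
discounted interest that the agent forgoes on a zero-interest loan of s.

```python
def present_value_of_taxes(s, r_next):
    """PV of paying s when young and receiving s when old: s + (-s) / (1 + r)."""
    return s + (-s) / (1.0 + r_next)


def discounted_forgone_interest(s, r_next):
    """Interest s * r not paid on a loan of s, discounted one period: s * r / (1 + r)."""
    return s * r_next / (1.0 + r_next)


s, r = 1.0, 0.05  # illustrative values only
pv = present_value_of_taxes(s, r)
lost = discounted_forgone_interest(s, r)
print(f"PV of taxes: {pv:.6f}, discounted forgone interest: {lost:.6f}")
```

The two numbers coincide for any s and r, which is the algebra in the equation above. The
present value is positive when r is positive, zero when r is zero, and negative when r is
negative.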
One can ask whether a pay-as-you-go system is a good idea in some normative sense within the
model. Although it is difficult to win battles on normative questions, it is important to discuss them.
A pay-as-you-go system in the life-cycle model helps some agents but hurts others. So moving
to this system does not result in a Pareto improvement as long as the real interest rate without the
system is positive each period. This conclusion about Pareto efficiency is a consequence of the
Proposition established in section 4 of Chapter 5. Thus, the life-cycle model does not give a ringing
endorsement for adopting social security systems on welfare grounds.
It is key to point out that the life-cycle model does not have any risks facing agents that the
model social security system helps to insure. The absence of risk is one reason for the relatively
simple conclusions reached for how social security impacts the economy. Recall that social security
is essentially a way to redistribute wealth across generations within this model. Some generations
get a positive transfer, whereas others get a negative present-value transfer. Social security plays no
insurance role in the life-cycle model.
Some economists have argued that there is a welfare argument to be made for government
provision of insurance. For example, James Mirrlees (1995, p. 384) provides two reasons for
government provision:7
From the point of view of insurance, there seem to me to be two compelling theoretical
arguments for having the State rather than the market provide a wide range of insurance, for
old-age pensions, disability and sickness, unemployment and low income: the first is that the
market handles adverse selection badly. The second is that, even if adverse selection were
not important, people should take out insurance at an age when they are incapable of doing
so rationally, namely zero.
7 Mirrlees (1995), Private Risk and Public Action: The Economics of the Welfare State. European Economic Review
39: 383-97. Mirrlees received the Nobel Prize in Economics in 1996 for his work on optimal income taxation.
The tax rates funding the various benefits have increased over time. There is a proportional tax
rate funding OASI benefits (10.6 percent), DI benefits (1.8 percent) and HI benefits (2.9 percent).
These are the total tax rates: half of each tax is paid by the employee and half by the employer. These
tax rates apply up to an upper bound or cap on earnings. Workers who have earnings above the cap
pay no extra taxes on the portion of earnings beyond the cap. The earnings cap on OASIDI taxes
was 106,800 dollars in 2010. There is currently no earnings cap on HI taxes. After the recession of
2008, the combined OASIDI tax rate described above was temporarily lowered for a period of two
years.
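The tax rules just described amount to simple arithmetic. The sketch below applies the rates and
the 2010 earnings cap quoted above. It assumes the quoted rates and the 2010 cap apply in the
same year, and it ignores the temporary rate reduction. The function name and the example
earnings levels are ours.

```python
OASI_RATE = 0.106          # total rate, employee plus employer
DI_RATE = 0.018
HI_RATE = 0.029
OASDI_CAP_2010 = 106_800.0  # earnings cap for the OASI and DI taxes in 2010


def payroll_tax(earnings):
    """Total payroll tax on annual earnings, split equally between employee and employer."""
    capped_earnings = min(earnings, OASDI_CAP_2010)
    oasdi = (OASI_RATE + DI_RATE) * capped_earnings  # the cap applies to OASI and DI
    hi = HI_RATE * earnings                          # no cap on HI
    total = oasdi + hi
    return {"oasdi": oasdi, "hi": hi, "total": total,
            "employee_share": total / 2.0, "employer_share": total / 2.0}


for earnings in (50_000.0, 200_000.0):
    tax = payroll_tax(earnings)
    print(f"earnings {earnings:,.0f}: total tax {tax['total']:,.2f}, "
          f"employee share {tax['employee_share']:,.2f}")
```

For earnings below the cap the combined rate is 15.3 percent. Above the cap, each extra dollar
of earnings is taxed only at the 2.9 percent HI rate.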
The benefits listed in Table 7.1 are largely funded on a pay-as-you-go basis. A system that
is funded purely on a pay-as-you-go basis collects taxes on current workers and pays out all
taxes to current beneficiaries. Thus, under such a system there is no trust fund built up to pay
current workers' future benefits. The U.S. system has small trust funds, one for each distinct
benefit, that are being rapidly drawn down. The sense in which these trust funds are small and
the speed with which they are being drawn down can be gauged from reading the annual Trustees
Reports for the Social Security and Medicare Systems. These trust funds hold only U.S. Treasury
debt. As one part of the Federal Government of the U.S. is holding the debt of another part of
the government, there has been a debate about whether or not such assets should be viewed as
“backing” for future benefit payments.
The Trustees Reports from each of the last several decades have forecast that, without
changes in tax rates or benefit formulas, there will be an impending crisis. Each report forecasts
the year in which each trust fund will run out of money. Forecasts from 2011 indicate that the
combined OASIDI trust fund will be exhausted by 2036. The problem is that benefit payments as a
fraction of GDP or total earnings are projected to increase over time, whereas tax payments
implied by current and future tax rates are projected to be stable as a fraction of GDP or total
earnings. Thus, once the trust funds are exhausted, annual tax revenues will cover only a fraction
of forecasted annual benefit payments. The trend in benefits and taxation is driven by
demographic change: people are living longer and the birth rate is not projected to increase.
Thus, the number of workers per retiree has declined over time and is projected to continue to
decline.
7.7 Overview
We take away three main fiscal policy lessons from theory.
1. Changes in government spending can in theory have a powerful impact on the economy. This
was illustrated in section 7.4 by comparing the economy with no government spending to
the economy with a constant positive amount of government spending that was financed by
always taxing young agents. This change in government spending was contractionary in that
it reduced output and aggregate physical capital in all future periods.
2. Different tax policies that finance the same government spending plan over time can have
exactly the same impact on the economy. The principle underlying this result is called
Ricardian Equivalence. Two policies that finance the same spending plan will have the same
impact when they impose the same present value of taxation on each household. This situation
was illustrated by Policy 1 and Policy 3 from section 7.4. This result holds independently of
the nature of the utility function for agents in the life-cycle model, but the argument for the
result does utilize the assumption that taxes are lump-sum taxes and not taxes that depend on
income (e.g. proportional or progressive income taxes).
3. Shifting the burden of paying for a fixed government spending plan onto future generations is
contractionary in the life-cycle model. This point was illustrated by comparing Policy 1 and
Policy 2 in the war finance example. It was also illustrated in comparing no social security
system to a pay-as-you-go social security system. Thus, starting up a new pay-as-you-go
social security system in the life-cycle model amounts to shifting tax burdens onto future
generations and away from the current generation.
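The second lesson can be illustrated with a small calculation. The sketch below is not the
chapter's Policy 1 or Policy 3. It compares two hypothetical ways of financing spending g for a
single cohort: taxing the young now, or borrowing g from the young and taxing the same agents
when old to repay the debt with interest. It assumes, as the law of motion in this chapter
implies, that the young save all after-tax wages, that bonds and capital pay the same return r,
and that r is given. All names and numbers are illustrative assumptions.

```python
def cohort_outcomes(wage, r, tax_young, tax_old, gov_borrowing):
    """Return (next-period capital, old-age consumption, PV of taxes) for one cohort.

    Assumption: the young save all after-tax wages, holding bonds and capital
    that both pay the return r.
    """
    saving = wage - tax_young
    capital_next = saving - gov_borrowing          # k_{t+1} = w_t - T_yt - b_t
    consumption_old = (1.0 + r) * saving - tax_old
    pv_taxes = tax_young + tax_old / (1.0 + r)
    return capital_next, consumption_old, pv_taxes


g, wage, r = 1.0, 10.0, 0.25  # illustrative values only

# Hypothetical policy A: tax the young g today, no debt.
tax_now = cohort_outcomes(wage, r, tax_young=g, tax_old=0.0, gov_borrowing=0.0)

# Hypothetical policy B: borrow g from the young, tax them (1 + r) * g when old.
tax_later = cohort_outcomes(wage, r, tax_young=0.0, tax_old=(1.0 + r) * g, gov_borrowing=g)

print("tax now:  ", tax_now)
print("tax later:", tax_later)
```

Both policies impose the same present value of taxes on the cohort, and under the stated
assumptions they leave next-period capital and old-age consumption unchanged. Shifting the tax
onto a different generation would break this equality, which is the point of the third lesson.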