
Macroeconomics:

A Growth Theory Approach


Alejandro Badel
Mark Huggett

First printing, December 2016


Contents

1 Introduction
1.1 Some Questions in Macroeconomics
1.2 The Role of General Equilibrium Theory
1.3 Microfoundations
1.4 Expectations
1.5 Why adopt a Growth Theory Approach?

2 Measurement of Output and Prices
2.1 Output and Income Accounting
2.2 Expenditure vs. Value Added Method
2.3 GDP vs. GNP
2.4 GDP in Simple Example Economies
2.5 US NIPA Facts
2.6 Comparing GDP Across Countries
2.7 Price Indexes
2.8 Cost-of-Living Index
2.9 Key Concepts

3 Growth Theory
3.1 Introduction
3.2 Production Function
3.2.1 Relative Wages in US Data
3.3 Solow Growth Model
3.3.1 The Basic Solow Model
3.3.2 The Full Solow Model
3.4 Is the Solow Model Consistent with Kaldor’s Facts?
3.5 Cross Country Comparisons
3.6 Growth Accounting
3.6.1 Growth Accounting: Theory
3.6.2 US Economy: 1909-49 and 1949-2016
3.6.3 The Asian Growth “Miracle”
3.7 Golden Rule
3.7.1 Bad Allocations
3.7.2 Observable Implications of Bad Allocations
3.8 Key Concepts

4 Dynamic Consumer Theory
4.1 Static Consumer Theory: Two Good Case
4.2 Dynamic Consumer Theory: Two Periods
4.3 Dynamic Consumer Theory: Many Periods
4.3.1 Lifetime Histories
4.4 Some Uses Of The Model
4.4.1 Consumption Patterns Over The Life Cycle
4.4.2 Consumption Responses to Temporary vs Permanent Shocks
4.4.3 Savings Rate Observations
4.5 Overview
4.6 Key Concepts

5 Life-Cycle Model
5.1 Benchmark Model
5.2 How the Benchmark Model Works
5.3 Analyzing a One-Time Shock
5.4 Can Model Allocations Be Improved?
5.5 Review of Marginal Conditions
5.6 Life-Cycle Model: One Downside to the Simple Formulation
5.7 Key Concepts

6 Business-Cycle Fluctuations
6.1 Business-Cycle Facts
6.2 Outlines of an Unsuccessful Theory
6.3 Technology Shocks and Business Cycles
6.3.1 A Model with Technology Shocks
6.4 The Keynesian View
6.4.1 A Simple Keynesian Model
6.5 Smoothing Out the Business Cycle
6.5.1 Expected Utility Theory
6.5.2 Gain to Eliminating Business Cycles
6.6 Key Concepts

7 Fiscal Policy
7.1 Accounting Framework
7.2 Present-Value Constraint
7.3 Fiscal Policy in the Life-Cycle Model
7.4 Three Ways to Finance a War
7.4.1 Analysis of Policy 1
7.4.2 Analysis of Policy 2
7.4.3 Analysis of Policy 3
7.4.4 Why Are Policy 1 and 3 Equivalent?
7.5 Multipliers
7.5.1 Empirics
7.5.2 Theoretical Models for Multipliers
7.6 Social Security Systems
7.6.1 Social Security: Theory
7.6.2 Social Security: Some US Facts
7.7 Overview
7.8 Key Concepts

1. Introduction

1.1 Some Questions in Macroeconomics


To get an idea of what macroeconomics is about, it is useful to list some of the questions that
macroeconomists have traditionally tried to answer:

1. What explains growth in gross domestic product (GDP) per capita?
2. What explains business-cycle fluctuations?
3. If the government provides a temporary tax cut, then what will happen to GDP?
4. Should the government try to smooth out business-cycle fluctuations?

A first take on macroeconomic questions is that some are positive and some are normative.
Specifically, questions 1-3 are positive in that they ask one to explain what is or what will be.
Answering these questions requires some theoretical or statistical framework and data, but it does
not necessarily require a system of values.
In contrast, question 4 is normative. An answer will rely at least implicitly upon a system of
values. To see this, suppose that two economists agree on how a particular proposal to smooth out
aggregate fluctuations will impact the economy. Even if they agree on this as a matter of positive
economics they could still disagree on the issue of staying with the status quo or adopting the
policy proposal. The reason is simply that they could still value the agreed-upon consequences
quite differently. For example, one economist may put more weight than the other on the proposal’s
impact on the welfare of low income households relative to high income households.
For this reason, progress on positive questions from better theory or better data may reduce but
not eliminate differences on normative issues.

1.2 The Role of General Equilibrium Theory


A second take on macroeconomic questions is that they require a theoretical framework that is
capable of addressing how the economy as a whole works. This is because the questions do not deal
only with a small part of the economy. For this reason, partial equilibrium methods, taught in your
first microeconomics course, will be of doubtful relevance.¹ Instead, general equilibrium methods
will be key. Such methods describe how all relevant variables (i.e. prices and quantities of all
goods and services) are simultaneously determined. This is one reason why macroeconomics is
difficult. The growth model that we will use in this book is an example of such a general equilibrium
model.

1.3 Microfoundations
The approach to answering these questions in this book will be almost entirely based on microeconomic
principles. The book analyzes theoretical model economies populated by individual
consumers that maximize utility and firms that maximize profit. As in microeconomics, consumers
pick the best choices that are within their budget sets and firms choose production inputs to
maximize profit. An implication of the assumption that consumers are making best choices is that
persuasive arguments for how government policy can produce welfare gains will have to be more
subtle than a layman may at first appreciate. An argument cannot be based on consumers making
suboptimal choices within their budget sets and the government simply pointing this out.

1.4 Expectations
We will also make the assumption that consumers and firms are forward-looking as they formulate
their best choices. Consumers are forward-looking in that they understand the possible shocks that
may impact the economy and how these shocks may impact the economic variables that matter for
them (for example wages and interest rates). This assumption is called rational expectations in the
macroeconomic literature.
The information about the future that the agents possess can be key. As we will see later on,
the response of such forward-looking consumers to shocks anticipated to affect the economy in
a temporary way can differ greatly from the response to shocks of the same magnitude that are
anticipated to affect the economy permanently. Also, the assumption of forward-looking agents
will be especially important in contemplating the effects of alternative government policies. The
analysis will be based on the assumption that consumers understand the effects of new policies
and make reasoned decisions based on how the world works under the new policies. The
forward-looking hypothesis is potentially highly relevant for policy evaluation because changes in
policy rules may affect the future economy in ways that are very hard to predict from
past experience alone when this past experience does not include the type of policy rule variation
being considered.

1.5 Why adopt a Growth Theory Approach?


A long-standing view in economics is that if one had a good theory of long-run economic growth,
then such a framework would be a useful starting point for addressing a wide range of macroeconomic
questions as well as questions considered in labor economics, public economics and other
fields. Growth theory deals with the fundamental reasons why some countries grow over time and
why some countries are rich whereas others are poor. The structure of this book reflects this view
on the centrality of growth theory. Thus, the book starts out by developing a theory of long-run
growth. The growth models used in this book are variants of the neoclassical growth model.

¹ For example, one might think that partial equilibrium methods could be useful for understanding how a freeze in
Florida may impact the price of orange juice. This makes sense if near-term supply depends mainly on price, weather
and the stock of orange trees, while demand depends on price but does not depend much on the weather in Florida. If
developments in the orange juice market have a negligible impact on demand side variables such as preferences, national
income and the prices of other goods, then the change in the price of orange juice might usefully be analyzed holding all
items other than weather fixed. This is a verbal description of the basis for fixing a demand curve while shifting the
supply curve. This approach is unlikely to be useful for addressing questions about the aggregate economy precisely
because macroeconomic shocks and policies are likely to impact many of the determinants of supply and demand.

2. Measurement of Output and Prices

What is the output of the economy and how is it used? The US National Income and Product
Accounts (NIPA) are a conceptual framework for organizing data on the production of goods and
services and data on the incomes received by factors of production. The origins of this framework
date back to the 1930s, when future Nobel laureate Simon Kuznets was commissioned by the US
Congress to develop an accounting framework for national income. His first calculations showed
that national income had dropped more than 50 percent between 1929 and 1932.¹ We now give a
brief sketch of the NIPA accounting framework.²

2.1 Output and Income Accounting


We start out with the concept of nominal Gross Domestic Product (GDP). Nominal GDP can be
defined as the total value of all final goods and services produced domestically at current-year
prices. We will often be more interested in the concept of real GDP. We define real GDP as the
total value of all final goods and services produced domestically at base-year prices.
We will now develop three accounting approaches to compute nominal GDP, which is denoted
$Y$ below. Each approach is a way of adding up observable quantities so that the sum equals
GDP. In what follows $\sum_i$ denotes the operation of summation over all the items indexed by the
symbol $i$.³

Definition 2.1.1 — Final Sales (or Expenditure) Method. Denoting the price of final good $g$
as $p_g$ and the final quantity produced of good $g$ as $y_g$, then measured GDP, using the Final Sales
Method, is given by:

$$Y = \sum_g p_g y_g \tag{2.1}$$

¹ See Kuznets (1934), “National Income 1929-1932”, NBER, June 1934.
² NIPA accounts are largely consistent with the System of National Accounts (SNA) of the United Nations, which
is applied by many countries. The NIPA are an important component of the US economic accounts, together with the
industry accounts (also known as input-output accounts), the financial accounts (also known as flow of funds accounts)
and the international accounts (balance of payments).
³ For example, if $i$ indexes four items $i = 1, 2, 3, 4$ then the symbol $\sum_i x_i$ is a short-hand notation for the sum
$x_1 + x_2 + x_3 + x_4$. In short, $\sum_i x_i = x_1 + x_2 + x_3 + x_4$.

This method involves first getting a list of all the final goods. Imagine that these are numbered
so that $g$ indicates the number of the good. The method states that we figure out the total expenditure
$p_g y_g$ on final good $g$ and then add up these expenditures across all the goods on this comprehensive
list. For this method to work we do not need to observe the final goods output $y_g$
and the price $p_g$ of good $g$ separately, but just their product, which is the expenditure. In some parts of
the economy prices and physical quantities are easily determined. For example, for oil producers
output can be measured in barrels of oil of a given quality grade and prices are stated in
dollars per barrel. However, in the legal services sector one can observe the total expenditure on
legal services but the units in which the output of these legal services can be measured are unclear.

Definition 2.1.2 — Value Added Method. Denoting the value added by firm $f$ as $VA_f$, then
measured GDP, using the Value Added method, is given by:

$$Y = \sum_f VA_f \tag{2.2}$$

where value added is measured as $VA_f = \text{Sales}_f - \text{Intermediate Goods Purchased}_f$.

The Value Added method involves creating a list of all the firms in the economy. We can index
a firm by its number $f$ on this list. The method then computes the value added of each firm and
adds the value added of all the different firms together. In the simple case in which a firm produces
just a single good, the dollar value of the sales of the firm could be thought of as equaling
$p_f y_f$, the product of the price of the good and the quantity produced. The term
$\text{Intermediate Goods Purchased}_f$ in the formula above stands for the value of all the intermediate
goods that are purchased by firm $f$.
An intermediate good is a good produced by a firm that is sold to another firm to be used in
production. Thus, an intermediate good does not leave the firm sector. A final good is a good
produced by a firm and sold directly to a household. An example of an intermediate good is the
corn which is produced by a farmer and sold to Kellogg’s to be converted into Kellogg’s Corn
Flakes. The Corn Flakes produced and sold to households counts as a final good. Any corn which
is produced by a farmer and sold directly to households is considered to be a final good. Thus,
some part of the total production of corn enters the calculation as an intermediate good and some
part enters the calculation as a final good.

Definition 2.1.3 — Factor Income Method. Measured GDP, according to the Factor Income
Method is the sum of all payments to national factors of production, depreciation and indirect
business taxes minus the net foreign factor income payments.

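$$Y = \text{Term 1} + \text{Term 2} + \text{Term 3} \tag{2.3}$$
$$\text{Term 1} = \text{Wages} + \text{Proprietor's Income} + \text{Corporate Profit} + \text{Interest} + \text{Rent} \tag{2.4}$$
$$\text{Term 2} = \text{Indirect Taxes} - \text{Net Foreign Factor Income} \tag{2.5}$$
$$\text{Term 3} = \text{Depreciation} \tag{2.6}$$

Note: In national accounts, Term 1 is called National Income and Term 1 + Term 2 is called Net Domestic Product.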

The Factor Income method is not as easily stated or as easily explained as the other methods.
However, the basic idea behind this method is not complicated. The basic idea is that all the value
of the final production of the firm sector of the economy has to be paid out to the owners of the
capital and labor (factors of production) that produce this output. Thus, instead of keeping track of
the value of final goods produced we could just count up all the incomes (factor payments) paid to
factors of production.

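[Figure 2.1: Expenditure and Factor Income Method: A Plumber’s Diagram. The diagram shows two boxes, FIRMS and HOUSEHOLDS, connected by flows labeled FACTOR PAYMENTS, CAPITAL AND LABOR, FINAL GOODS and EXPENDITURES, together with a flow labeled INTERMEDIATE GOODS.]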

This simple idea is often illustrated using the Plumber’s Diagram in Figure 2.1. The intuition is
that the payments made by households to firms in exchange for final goods can be viewed as one
flow of water through a closed system of pipes and, in the absence of any leakages, this amount of
water must also flow from firms to households. Households supply the labor and capital services
that firms use to produce final goods.
The next step is to try to make this idea work in practice. First, make a list of distinct types
of payments to factors of production. This would include (i) wages paid to employees, (ii) all the
income paid to sole proprietors: those who are self-employed, (iii) all the corporate profit by firms
organized as corporations, (iv) the net interest paid by firms to lenders and (v) the rent paid by firms
or individuals for using capital (e.g. using a building or using machinery for a period of time). All
of these items are contained in the National Income component of GDP.
One important leakage is due to government. Specifically, governments sometimes take away
some part of a firm’s income before the firm has had a chance to pay out this income to labor and
capital. Sales taxes in the United States are an example of this. The national accounts add this
leakage back so that both the income and expenditure approach will count the same thing. This
leakage is labeled Indirect Taxes in the Income approach. Thus, it is apparent that in practice Figure
2.1 needs to be modified to capture this important leakage.
Depreciation is another important complication that must be addressed. Here the issue is not
whether or not physical capital wears out over time or by use. Instead, the Depreciation term is
added back into the GDP formula for the factor income method simply because income accountants
end up using data on corporate profits and proprietors’ income that are calculated after subtracting an
accounting measure of depreciation allowed by tax laws. For example, in a stylized calculation of
corporate profits, a corporation starts with total revenue and then subtracts wages and depreciation
to get to corporate profits. The upshot is that the sum of wages and corporate profit does not
equal the corporation’s revenue: the depreciation that was subtracted would be missing. Thus, the
depreciation calculated by the accountants based on tax law needs to be tacked back on so that all
the revenue of the firm is accounted for in payments to owners of the factor inputs used by the firm.
This accounts for why a Depreciation term is added in term 3 of the formula for the Factor Income
method.

2.2 Expenditure vs. Value Added Method


So far we have not explained why the expenditure and value-added methods should produce the
same value for GDP. In essence, the value added approach must produce the same number as the
expenditure approach because the value of intermediate goods production enters positively in a
calculation of one firm’s value added but enters negatively, and of equal value, in the calculation of
value added of other firms. Thus, the value added approach amounts to a tricky way of counting
only the value of the final goods and services produced. Of course, this is exactly what needs to be
true if both approaches are equivalent.
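
The cancellation argument can be written out in symbols. The following is a sketch under simplifying assumptions that are not stated in the text: every intermediate good sold by a domestic firm is purchased by another domestic firm in the same period (no trade in intermediate goods and no changes in inventories). The symbols $F_f$ (firm $f$'s sales of final goods), $M_f^{\text{sold}}$ (its sales of intermediate goods) and $M_f^{\text{bought}}$ (its purchases of intermediate goods) are notation introduced here for illustration only.

```latex
\begin{align*}
\sum_f VA_f
  &= \sum_f \left( \text{Sales}_f - \text{Intermediate Goods Purchased}_f \right) \\
  &= \sum_f \left( F_f + M_f^{\text{sold}} \right) - \sum_f M_f^{\text{bought}} \\
  &= \sum_f F_f
     + \underbrace{\left( \sum_f M_f^{\text{sold}} - \sum_f M_f^{\text{bought}} \right)}_{=\,0}
   = \sum_g p_g y_g
\end{align*}
```

Under these assumptions the bracketed term is zero because each intermediate-goods transaction appears once as a sale and once as a purchase, so total value added equals total final sales, which is the expenditure measure of GDP.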

2.3 GDP vs. GNP


Gross National Product (GNP) used to be the standard measure of income highlighted in newspapers
and in government policy discussions in the United States prior to 1991, when it was replaced with
GDP. Nominal GNP is defined as the total value of income paid to all nationally owned factors of
production in a period of time at current prices. Unlike GDP, GNP is not a geographic concept.
It adds together income paid to nationally owned factors of production regardless of where in the
world they are located. GDP is based on geographic borders of a country as it focuses on the
income paid to domestically located factors of production or, alternatively, the value of all final
goods produced domestically. Given the definition above, it is clear that GNP equals GDP plus
net foreign factor income. The term net foreign factor income adds in all the wage payments and
capital payments to labor and capital owned by U.S. nationals located abroad but subtracts the
wage and capital payments to foreign labor or foreign owned capital located within the geographic
confines of the United States. For the US economy, GDP and GNP do not differ dramatically.

2.4 GDP in Simple Example Economies


We now calculate GDP in simple example economies. The example economies are highly stylized
so as to highlight as clearly as possible how GDP accounting handles three important issues:
intermediate goods, depreciation and the distinction between GDP and GNP.
Example 2.1 — Highlight Intermediate Goods. A Farmer produces $y_1 = 10$ wheat with labor.
A Miller produces $y_2 = 10$ flour with labor and 10 wheat. A Baker produces $y_3 = 10$ bread with
labor and 10 flour. Price data: $p_1 = 1$, $p_2 = 2$, $p_3 = 4$ per unit of each good.

Three methods are employed to compute GDP:

$$Y = \sum_g p_g y_g = p_1 y_1 + p_2 y_2 + p_3 y_3 = 1 \times 0 + 2 \times 0 + 4 \times 10 = 40 \tag{2.7}$$

$$Y = \sum_f VA_f = VA_1 + VA_2 + VA_3 = 10 + (20 - 10) + (40 - 20) = 40 \tag{2.8}$$

$$Y = \text{Wages} + \text{Profit} + \text{Proprietor's Income} = 0 + 0 + 40 = 40 \tag{2.9}$$


To apply the expenditure approach it is critical to keep in mind the distinction between final
goods production, intermediate goods production and the total production of a good. In this
example, the total production of wheat is 10 units but all of this is sold to the Miller and converted
into flour. Thus, of the 10 units of total production of wheat 10 counts as intermediate goods
production and 0 counts as final goods production. Recall that the production of some amount of a
good is considered to be final goods production if it is sold directly to households. The production
of some amount of a good which is sold to another goods producer is considered to be intermediate
goods production. It is useful to look again at the Plumber’s Diagram in Figure 2.1 to see this
distinction.
To apply the value added approach, we simply calculate the value added for each firm. In this
example, the Farmer, Miller and Baker are considered to be different firms. The value added of
each firm is measured in a common unit of account which is taken to be dollars in this example.
One could use a different but common unit of account, such as bread, if one wanted to do so. There
is nothing wrong with doing GDP accounting in a different unit of account.
To apply the income approach, we need to make some assumption on how the Farmer, Miller
and Baker are organized. They could be organized as sole proprietors or as firms which pay out
wages and profits. The text of this example did not provide this information. If we assume that
each is a sole proprietor then the income of each is simply value added in which case the Farmer’s
and the Miller’s income are both 10, whereas the Baker’s income is 20. This is the assumption used
above.
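
The three calculations for Example 2.1 can be reproduced in a few lines of Python. This is a minimal sketch rather than a general accounting tool: the dictionary layout and variable names are illustrative choices, and the income calculation relies on the sole-proprietor assumption discussed above.

```python
# Example 2.1 data in dollars. The field names are illustrative.
firms = {
    "farmer": {"sales": 1 * 10, "intermediates": 0, "final_sales": 0},
    "miller": {"sales": 2 * 10, "intermediates": 10, "final_sales": 0},
    "baker": {"sales": 4 * 10, "intermediates": 20, "final_sales": 40},
}

# Expenditure method: add up sales to households only.
gdp_expenditure = sum(f["final_sales"] for f in firms.values())

# Value added method: sales minus intermediate purchases, firm by firm.
value_added = {name: f["sales"] - f["intermediates"] for name, f in firms.items()}
gdp_value_added = sum(value_added.values())

# Income method, assuming each firm is a sole proprietor whose income
# equals its value added, so wages and corporate profit are both zero.
wages, corporate_profit = 0, 0
proprietors_income = sum(value_added.values())
gdp_income = wages + corporate_profit + proprietors_income

print(gdp_expenditure, gdp_value_added, gdp_income)  # 40 40 40
```

Only the Baker has nonzero final sales, which is why wheat and flour enter the expenditure sum with a quantity of zero even though 10 units of each are produced.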
Example 2.2 — Highlight the Treatment of Depreciation. Consider a world with two firms
that use capital and labor to produce two types of goods. Firm 1 produces 20 dollars of a consumption
good using labor and capital and firm 2 produces 10 dollars of a capital good using labor. The
output, wages, profit and depreciation for these two firms are stated below.

$$\text{Profit}_1 = \text{output}_1 - \text{wages}_1 - \text{depreciation}_1 = 20 - 10 - 5 = 5 \tag{2.10}$$

$$\text{Profit}_2 = \text{output}_2 - \text{wages}_2 - \text{depreciation}_2 = 10 - 10 - 0 = 0 \tag{2.11}$$

Three methods are employed to compute GDP:

$$Y = C + I + G = 20 + 10 + 0 = 30 \tag{2.12}$$

$$Y = \sum_f VA_f = VA_1 + VA_2 = (20 - 0) + (10 - 0) = 30 \tag{2.13}$$

$$Y = \text{Wages} + \text{Corporate Profit} + \text{Depreciation} = 20 + 5 + 5 = 30 \tag{2.14}$$

The accounting equation $Y = C + I + G + NX$, commonly taught in introductory courses, is
simply a version of the expenditure approach. When we use this equation (rather than $Y = \sum_g p_g y_g$)
we are grouping the expenditure on final goods and services into categories: $C$ for consumption
expenditures, $I$ for investment expenditures, $G$ for government spending and $NX$ for net exports.
Note that the government purchases of goods and services that enter the $G$ term above do not include
government transfer payments (e.g. government social security checks). The government purchases
$G$ that enter the NIPA accounts in the equation $Y = C + I + G + NX$ reflect expenditures on goods
and services such as the expenditure on elementary school education by local governments. Many
of the simple example economies analyzed in this book, including example 2.2, will not have a
foreign sector and, thus, net exports $NX$ will be zero.
If one reflects on example 2.2 it is clear that the calculation of GDP is not sensitive to how
depreciation is calculated. The basic idea of the income approach is that as long as the value of
the output of all firms is paid to factors of production (i.e. owners of capital and labor), then the
sum of factor payments must equal the value of this final output. To clarify this point, suppose that
the corporate accountants (or the corporate tax laws) change the rules for calculating depreciation.
Suppose to be concrete that depreciation for firm 1 in example 2.2 is now 10 rather than 5. GDP
computed using the income method will still be 30. The reason is that corporate profits shrink by 5
but depreciation grows by 5. GDP, as measured by the income approach, is unchanged.
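
The invariance of income-method GDP to the depreciation rule can be checked numerically. The sketch below uses the outputs and wages from example 2.2; the function name `income_method_gdp` and the list-based data layout are illustrative and not part of the text.

```python
def income_method_gdp(firms, depreciation):
    """Wages + corporate profit + depreciation, where each firm's profit is
    computed after subtracting its accounting depreciation charge."""
    wages = sum(f["wages"] for f in firms)
    profit = sum(
        f["output"] - f["wages"] - d for f, d in zip(firms, depreciation)
    )
    return wages + profit + sum(depreciation)


firms = [
    {"output": 20, "wages": 10},  # firm 1: consumption good
    {"output": 10, "wages": 10},  # firm 2: capital good
]

# Original rule: firm 1 is charged 5 in depreciation.
gdp_original_rule = income_method_gdp(firms, [5, 0])
# Alternative rule: firm 1 is charged 10 in depreciation.
gdp_alternative_rule = income_method_gdp(firms, [10, 0])

print(gdp_original_rule, gdp_alternative_rule)  # 30 30
```

Each extra dollar of depreciation lowers measured profit by one dollar and raises the depreciation term by one dollar, so the two changes offset exactly.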
Example 2.3 — Highlight GDP vs GNP. A small country produces 10 in vacation services using
capital and labor. Labor is paid 5 and capital is paid 5 in rent. The country’s nationals do not own
any capital and the only source of labor income is based on work within the country.
What are GDP and GNP? To calculate GDP we use the factor incomes approach as there is
information on factor income payments. To calculate GNP we use the definition, discussed in
section 2.3, that GNP is GDP plus net foreign factor income. Net foreign factor income is −5 here
because the rent of 5 is paid to foreign owners of capital while nationals earn no income abroad.

$$\text{GDP} = \text{wages} + \text{rent} = 5 + 5 = 10 \tag{2.15}$$

$$\text{GNP} = \text{GDP} + \text{net foreign factor income} = 10 - 5 = 5 \tag{2.16}$$
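
The same calculation can be expressed as a small function. This is a sketch: the function name `gnp` and its two income arguments are hypothetical, and it assumes, as the example suggests, that all labor is supplied by nationals and all capital is foreign owned.

```python
def gnp(gdp, income_earned_abroad_by_nationals, income_paid_to_foreigners):
    """GNP = GDP + net foreign factor income."""
    net_foreign_factor_income = (
        income_earned_abroad_by_nationals - income_paid_to_foreigners
    )
    return gdp + net_foreign_factor_income


wages, rent = 5, 5
gdp = wages + rent  # factor income method inside the country's borders

# Assumption: all capital is foreign owned, so the full rent leaves the
# country, and nationals earn nothing abroad.
gnp_value = gnp(gdp, income_earned_abroad_by_nationals=0, income_paid_to_foreigners=rent)

print(gdp, gnp_value)  # 10 5
```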

Example 2.4 — Highlight Intermediate Goods and Productivity Growth. A farmer can
produce 1 unit of corn for each unit of labor. A restaurateur can produce 1 unit of corn pancakes
using one unit of corn and one unit of labor. In period 1 the farmer produces 10 units of corn using
10 units of labor. In period 1 the restaurant produces 10 units of pancakes using 10 units of corn and
10 units of labor. The period 1 prices of corn and pancakes are $(p_{corn}, p_{pancakes}) = (1, 2)$.
In period 2 productivity improves. The farmer can now produce 2 units of corn per unit of
labor and the restaurant can produce 2 units of pancakes for each unit of corn and unit of labor it
uses. In period 2 the farmer produces 40/3 units of corn using 20/3 units of labor. In period 2 the
restaurant produces 80/3 units of pancakes using 40/3 units of corn and 40/3 units of labor. The
total labor used in the economy is 20 units in both period 1 and 2, although its allocation across
firms differs across periods.
What is real GDP in each period using period 1 prices? The answer is determined by applying
the expenditure approach. A subscript on GDP denotes the period in which GDP is measured. Real
GDP more than doubles across periods even though productivity in corn production and in
pancake production each only doubles.

$$\text{GDP}_1 = p_{corn} y_{corn} + p_{pancakes} y_{pancakes} = 1 \times 0 + 2 \times 10 = 20 \tag{2.17}$$

$$\text{GDP}_2 = p_{corn} y_{corn} + p_{pancakes} y_{pancakes} = 1 \times 0 + 2 \times 80/3 = 160/3 \approx 53.3 \tag{2.18}$$
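
The numbers in this example can be verified with exact fractions. The sketch below is illustrative only; the names `BASE_PRICES` and `real_gdp` are introduced here and do not appear in the text.

```python
from fractions import Fraction

BASE_PRICES = {"corn": 1, "pancakes": 2}  # period 1 prices


def real_gdp(final_output):
    """Value final-goods output at base-year (period 1) prices."""
    return sum(BASE_PRICES[good] * qty for good, qty in final_output.items())


# All corn is sold to the restaurant, so final corn output is 0 in both periods.
final_period_1 = {"corn": 0, "pancakes": 10}
final_period_2 = {"corn": 0, "pancakes": Fraction(80, 3)}

gdp_1 = real_gdp(final_period_1)
gdp_2 = real_gdp(final_period_2)
growth_factor = Fraction(gdp_2) / gdp_1

# Labor allocation by sector, as stated in the example.
labor_1 = {"farm": 10, "restaurant": 10}
labor_2 = {"farm": Fraction(20, 3), "restaurant": Fraction(40, 3)}

print(gdp_1, gdp_2, growth_factor)  # 20 160/3 8/3
print(sum(labor_1.values()) == sum(labor_2.values()))  # True
```

Real GDP rises by a factor of 8/3 while total labor stays at 20 units. In the example's period 2 allocation, the share of labor working in the restaurant rises from one half to two thirds.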

2.5 US NIPA Facts


Table 2.1 highlights the results of applying the expenditure method to US data.

Table 2.1: US Nominal GDP in 2016: Expenditure Components

Component                     Billions of dollars
GDP (Y)                                  18,569.1
Consumption (C)                          12,757.9
Investment (I)                            3,035.7
Government (G)                            3,276.7
Net Exports (NX)                           -501.3
Source: Bureau of Economic Analysis Table 1.1.5

Table 2.2: US Nominal GDP in 2016: Factor Income Components

Component                     Billions of dollars
GDP (Y)                                  18,569.1
Compensation of Employees                10,101.3
Proprietor’s Income                       1,417.5
Rent                                        704.7
Corporate Profit                          2,088.1
Net Interest                                524.1
Tax on Production                         1,237.6
Depreciation                              2,910.4
Source: Bureau of Economic Analysis Table 1.12 and Table 1.7.5

Consumption expenditures are by far the largest expenditure component of GDP. Consumption
expenditures consist of purchases of non-durable goods and services as well as durable goods. A
refrigerator is an example of a durable good whereas an apple is an example of a non-durable good.
Legal advice is an example of a service. Investment expenditures consist of purchases of residential
and non-residential structures and equipment purchases among other categories. Government
expenditures are purchases of goods and services. These can be broken down into state and local
government expenditures and federal government expenditures. A key component of state and local
government expenditures is expenditure on public education.
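
The expenditure shares implied by Table 2.1 can be computed directly from the reported figures. This is a minimal sketch; the variable names are illustrative and the numbers are copied from the table.

```python
# Table 2.1 figures, in billions of dollars.
gdp = 18569.1
components = {"C": 12757.9, "I": 3035.7, "G": 3276.7, "NX": -501.3}

# Share of each expenditure component in nominal GDP.
shares = {name: value / gdp for name, value in components.items()}
total = sum(components.values())

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"sum of components: {total:.1f} (reported GDP: {gdp:.1f})")
```

Consumption comes out at roughly 69 percent of GDP, and the four components sum to the reported GDP figure up to rounding in the table.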
Table 2.2 highlights the results of applying the factor income method to US data. The sum of the
first five components is called National Income. Tax on Production and Indirect Tax are synonyms;
earlier we used the term Indirect Tax. These taxes include property taxes, sales taxes and excise
taxes, among others. Adding National Income and Tax on Production together
we get Net Domestic Product. Adding Net Domestic Product and Depreciation together we finally
get to GDP.⁴
Figure 2.2 graphs the behavior of real GDP and expenditure components over time. The vertical
scale is in log units. Log units are useful when a series grows over time. Specifically, if a series
has a constant positive growth rate then when the series is plotted in log units it will be a straight
line with a positive slope.⁵ The greater the slope, the larger the growth rate of the series being
plotted. One key property of Figure 2.2 is that, aside from the Great Depression, WWII and some
large recessions, US real GDP grows at a fairly constant rate over much of the last 100 years. Thus,
an important issue in macroeconomics is what explains such a sustained positive growth rate. The
chapter on growth theory will address this issue.
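
The straight-line property of a constant-growth series in log units can be checked numerically. In the sketch below the starting value of 100 and the growth rate of 3 percent are hypothetical choices made only for illustration, and the function name is not from the text.

```python
import math


def constant_growth_series(y0, g, periods):
    """Return y_0, y_1, ..., y_T for a series growing at constant rate g."""
    return [y0 * (1 + g) ** t for t in range(periods + 1)]


g = 0.03  # hypothetical growth rate
series = constant_growth_series(100.0, g, 10)

# Period-to-period change in the log of the series, i.e. the slope in log units.
log_series = [math.log(y) for y in series]
slopes = [later - earlier for earlier, later in zip(log_series, log_series[1:])]

# Every slope equals log(1 + g), so the log series is a straight line.
is_straight_line = all(math.isclose(s, math.log(1 + g)) for s in slopes)
print(is_straight_line)  # True
```

For small growth rates the slope log(1 + g) is close to g itself, which is why the slope of a log-scale plot can be read as an approximate growth rate.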
Two other properties stand out in Figure 2.2. First, consumption parallels GDP so that the
long-run growth rate of aggregate consumption expenditures is similar to that of GDP. Second, it is
apparent that the year-to-year fluctuations in investment expenditures are much more volatile in
percentage terms than are the corresponding fluctuations in consumption expenditures or GDP. Thus,
a major question in business-cycle theory is what accounts for the year-to-year or quarter-to-quarter
4 In practice, there are a few additional terms in the Bureau of Economic Analysis tables for the factor incomes
approach that we will not discuss. One of these is called Statistical Discrepancy.
5 Here is the math. Assume that the variable y grows at a constant growth rate g so that $y_t = y_0 (1 + g)^t$. Take logs of
both sides to get $\log y_t = \log(y_0 (1 + g)^t) = \log y_0 + t \log(1 + g)$. Thus, the slope will be $\log(1 + g)$ so that the slope is
larger the larger is the growth rate g.

Figure 2.2: Real GDP and Components. [Plot: US real GDP, consumption, investment and government expenditures, 1920-2020; vertical axis in log scale, 2009 dollars.]

fluctuations in GDP and why are investment fluctuations so volatile in percentage terms compared
to consumption and GDP? Thus, business-cycle theory is concerned with the patterns in the high
frequency wiggles in Figure 2.2 rather than the magnitude of the slopes of the long-run trends of
these aggregates. The chapter on business-cycle fluctuations will offer competing theories for the
fundamental sources of these fluctuations.
Figure 2.3 graphs the behavior of labor’s share of income in the US. The calculation is based
on the factor incomes method for calculating GDP highlighted in Table 2.2. Labor’s share is not
simple to calculate because some of the items (e.g. Tax on Production and Proprietor’s Income)
calculated in the BEA Tables are not clearly a payment to labor input or a payment to capital inputs.
Proprietor’s Income is all of the income of businesses organized as sole proprietors and, therefore,
reflects both the payments to the business owner’s labor and capital. Tax on Production (e.g. sales
taxes) are taxes that are paid by a firm before the firm has the chance to pay firm revenues to capital
and labor. Thus, it is unclear how to split Tax on Production between capital and labor payments.
With these issues in mind, we calculate labor’s share as the ratio of the Compensation of Employees
to the sum of all of the subcomponents of GDP listed in Table 2.2 except Tax on Production and
Proprietor’s Income. Figure 2.3 graphs the results of this calculation.
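As a worked example of this calculation, the Python sketch below applies the stated rule to the 2016 figures in Table 2.2. The function name `labor_share` and the dictionary layout are illustrative choices, not part of the BEA tables.

```python
# Factor income components from Table 2.2 (billions, US 2016).
components = {
    "Compensation of Employees": 10101.3,
    "Proprietor's Income": 1417.5,
    "Rent": 704.7,
    "Corporate Profit": 2088.1,
    "Net Interest": 524.1,
    "Tax on Production": 1237.6,
    "Depreciation": 2910.4,
}

# Items that are not clearly a payment to labor or to capital are excluded.
ambiguous = {"Tax on Production", "Proprietor's Income"}

def labor_share(items, excluded):
    """Compensation of Employees divided by the sum of the non-excluded components."""
    denominator = sum(value for name, value in items.items() if name not in excluded)
    return items["Compensation of Employees"] / denominator

share_2016 = labor_share(components, ambiguous)
print(f"Labor's share in 2016: {share_2016:.3f}")
```

With these inputs the denominator is 16,328.6 billion, so the 2016 value of labor's share is roughly 0.62.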
Economists have long been interested in patterns of how GDP is divided between households
and how GDP is divided between classes of factor inputs. Developing a theory for how GDP is
divided between labor and capital income (the so called functional distribution of income) is a
problem with a long history. David Ricardo viewed this issue as the "principal problem in political
economy". One might guess that a large decline in labor’s share may lead to revolution in some
countries or a large exodus of workers to other countries. Karl Marx famously made the opposite

Figure 2.3: Labor’s Share. [Plot: labor’s share of income, US 1929-2015; vertical axis from 0 to 1.]

conjecture. He thought that the “tendency of the rate of profit to fall” would lead to a crisis in
capitalism.
Figure 2.3 shows that in US data labor’s share of income has been fairly stable for roughly the
last 100 years and has averaged 65 percent over this period. This also implies that capital’s share of
income (i.e. 1 minus labor’s share) has averaged roughly 35 percent.
However, in the last 15 years or so labor’s share has fallen in the US. This has led to a
substantial amount of new empirical work. Some authors have documented that the decline in
labor’s share in the last few decades has occurred in many advanced economies.6 This fact
suggests that forces affecting all of these countries, such as technological change,
outsourcing and the integration of China into the world economy, are natural candidate
explanations.

2.6 Comparing GDP Across Countries


Suppose one wants to compare US GDP to the GDP of other countries. This can be done in several
ways. The conversion can be done using a currency exchange rate, by using a Purchasing Power
Parity (PPP) exchange rate or by computing GDP using a common set of world relative prices.
Below we highlight the exchange rate comparison (Method 1) versus two other methods.
Method 2 is based on a PPP exchange rate and works in two steps. First, compute GDP in each
country in each country’s currency. Then define a common basket of goods commonly available in
6 See Karabarbounis and Neiman (2014), The Global Decline of the Labor Share, Quarterly Journal of Economics, 129, 61-103.

all countries. Denote the basket by quantities (x1 , x2 , ..., xn ) of the n distinct goods. The method
consists of computing the cost of the basket in each country and then taking the ratio. The Economist
magazine regularly provides PPP conversion rates based on a basket consisting only of one Big
Mac hamburger. Given this choice of a common basket, the PPP conversion rate equals the US cost
of the basket as a ratio to the Indian cost of the basket.
Method 3 is a very different comparison that is based on calculating GDP across countries using
a common set of “world relative prices”. There is a literature on how to weight country specific
relative prices to get the world relative prices for pairs of goods. We will not get into the merits of
different schemes to assign these weights. A large empirical literature in economics is based on a
longstanding research project to improve international comparisons by applying Method 3.

Definition 2.6.1 — Exchange Rate Method. Compare $GDP^{US}$ to $e \cdot GDP^{India}$, where $e$ is the
exchange rate in units of Dollars per Indian Currency.

Definition 2.6.2 — PPP Exchange Rate e∗ Method. Compare $GDP^{US}$ to $e^* \cdot GDP^{India}$, where
$e^* = \sum_g p_g^{US} x_g / \sum_g p_g^{India} x_g$ is the PPP exchange rate.

Definition 2.6.3 — World Relative Price Method. Compare $GDP^{US} = \sum_g p_g y_g^{US}$ to
$GDP^{India} = \sum_g p_g y_g^{India}$, where $p_g$ is the “world relative price” of good g.
Comment: You can download GDP calculated in world relative prices from the Penn World
Tables website (http://pwt.econ.upenn.edu/) . This is the standard data source that economists
use to make cross-country GDP comparisons.

If one compares GDP across countries using the Exchange Rate method then a typical finding
is that the ratio of GDP in a relatively poor country (e.g. India) to US GDP is much smaller
compared to calculating the same ratio using the World Relative Price method. The economic size
of developing countries in the world is much larger when comparisons are made using method 3.
One reason for this is that the relative price of internationally non-tradable goods (e.g. haircuts
and housing) to tradable goods (e.g. corn and computers) is much lower in poor countries and
non-tradable goods are an important component of GDP.
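The gap between the Exchange Rate method and the PPP method (Definitions 2.6.1 and 2.6.2) can be illustrated with a small numerical sketch. All prices, the basket, the market exchange rate and the Indian GDP figure below are hypothetical numbers invented for the example; they are chosen so that the non-tradable good (haircuts) is relatively cheap in India while the market exchange rate equalizes the price of the tradable good (corn).

```python
def basket_cost(prices, basket):
    """Cost of a basket of goods at the given prices."""
    return sum(prices[g] * basket[g] for g in basket)

def ppp_exchange_rate(prices_us, prices_india, basket):
    """e* = US cost of the basket divided by the Indian cost of the same basket."""
    return basket_cost(prices_us, basket) / basket_cost(prices_india, basket)

# Hypothetical prices: dollars in the US, rupees in India.
prices_us = {"corn": 2.0, "haircut": 20.0}
prices_india = {"corn": 100.0, "haircut": 200.0}
basket = {"corn": 10.0, "haircut": 1.0}

# Hypothetical market exchange rate (dollars per rupee) that equalizes the corn price.
e_market = 0.02
e_star = ppp_exchange_rate(prices_us, prices_india, basket)

gdp_india_rupees = 1_000_000.0                          # hypothetical Indian GDP
gdp_india_exchange_rate = e_market * gdp_india_rupees   # Method 1
gdp_india_ppp = e_star * gdp_india_rupees               # Method 2

# Indian GDP in dollars is larger under PPP because haircuts are relatively cheap there.
assert gdp_india_ppp > gdp_india_exchange_rate
```

In this sketch the basket costs 40 dollars in the US and 1,200 rupees in India, so e∗ is 40/1200 dollars per rupee, which exceeds the assumed market rate of 0.02.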

2.7 Price Indexes


Price indexes help track how the purchasing power of money changes over time. Two standard
price indexes are presented below. The first is a “fixed basket” index. The standard example of this
type of index is the Consumer Price Index (CPI), which is widely reported on in the press. In this
type of price index one constructs a weighted average of the price of goods over time where the
“weights” one uses are the fixed quantities xg of the different goods g in the basket. We normalize
the CPI by dividing by ∑g p∗g xg , which is the cost of the basket in a base year.

Definition 2.7.1 — Fixed Basket Price Index. Let

$x_g$: be the quantity of good g in the basket of goods.
$p_{gt}$: be the price of good g at time t.
$p^*_g$: be the price of good g in the base year.
Then the Fixed-Basket Consumer Price Index is defined as follows:

$$CPI_t = \frac{\sum_g p_{gt} x_g}{\sum_g p^*_g x_g}$$

The numerator of the CPI is the cost of the basket in year t, whereas the denominator is the cost
of the same basket in the base year. Note that as defined above the CPI is equal to 1.0 in the base
year. In some textbooks, the CPI as defined above is multiplied by 100 so that the index equals
100 in the base year rather than 1. This type of index is sometimes referred to as a Laspeyres
Index.
The second price index is a “time-varying basket” index. The standard example of this type of
index is known as the GDP Deflator. This type of price index is also a weighted average of prices.
In this case the weights ygt are the quantities of the different final goods produced in year t. From a
mathematical point of view, the key difference between the two indexes is that in one the weights
do not change but in the other the weights change over time.

Definition 2.7.2 — The Time-Varying Basket Price Index. Let

$y_{gt}$: be the quantity of final good g in the basket in year t.
$p_{gt}$: be the price of good g at time t.
$p^*_g$: be the price of good g in the base year.
Then the Time-Varying Basket Price Index (GDP deflator) is defined as follows:

$$deflator_t = \frac{\sum_g p_{gt} y_{gt}}{\sum_g p^*_g y_{gt}}$$
The numerator of the GDP Deflator is nominal GDP in year t. This follows if the weights
ygt are the quantities of final good g produced in year t. The denominator is real GDP in year t.
The index thus tells one the cost of buying up current-year final output at current-year prices
compared to buying it up at base-year prices. This type of index is sometimes referred to as a
Paasche index.
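The two definitions can be compared side by side in a short Python sketch. The two-good economy, its prices and its quantities are hypothetical, and the function names `fixed_basket_index` and `gdp_deflator` are illustrative labels for Definitions 2.7.1 and 2.7.2.

```python
def fixed_basket_index(prices_t, prices_base, basket):
    """CPI_t: cost of a fixed basket at year-t prices relative to base-year prices."""
    cost_t = sum(prices_t[g] * basket[g] for g in basket)
    cost_base = sum(prices_base[g] * basket[g] for g in basket)
    return cost_t / cost_base

def gdp_deflator(prices_t, prices_base, output_t):
    """Deflator_t: nominal GDP in year t divided by real GDP in year t."""
    nominal_gdp = sum(prices_t[g] * output_t[g] for g in output_t)
    real_gdp = sum(prices_base[g] * output_t[g] for g in output_t)
    return nominal_gdp / real_gdp

# Hypothetical two-good economy: only the price of corn rises between the two years.
prices_base = {"corn": 1.0, "computers": 2.0}
prices_t = {"corn": 2.0, "computers": 2.0}
basket = {"corn": 1.0, "computers": 1.0}     # fixed CPI basket x_g
output_t = {"corn": 1.0, "computers": 3.0}   # year-t final output y_gt

cpi_t = fixed_basket_index(prices_t, prices_base, basket)    # (2 + 2) / (1 + 2)
deflator_t = gdp_deflator(prices_t, prices_base, output_t)   # (2 + 6) / (1 + 6)
print(cpi_t, deflator_t)
```

The two indexes disagree here only because the weights differ: corn, whose price rose, has a larger weight in the fixed basket than in year-t output.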
Why is it important to measure price indexes such as the CPI accurately? Here are some
standard answers:
1. The CPI is used to index social security retirement payments. Thus, any systematic bias will
be compounded year after year. If the CPI is biased upwards as a measure of the “cost of
living” in the sense that the CPI tends to grow faster than a true cost of living index, then
social security payments will effectively bear interest. This can turn out to be in aggregate
financial terms a very big deal simply because of the logic of compound interest.
2. One of the two mandates of the US Federal Reserve Bank is to keep inflation low. Therefore
biases in the price index can influence monetary policy.
3. The federal income tax code in the United States links tax brackets to the CPI. Thus, absent a
change in legislation, if the inflation rate is 10 percent then the level of income at which a
given tax rate applies is also raised by 10 percent. Any systematic bias in measured inflation
can increase or decrease real tax revenue as inflation occurs.
4. The price data collected by the Bureau of Labor Statistics in the U.S. is used to compute GDP.
Specifically, one can figure out the quantity of the different final goods that are produced
each year by dividing total expenditures on these goods by prices. If the prices are too high,
then the quantities produced, which one infers from price and expenditure data, are too low.
This can be important in computing GDP growth rates. Suppose it is the case that year by
year the calculated inflation rate in a specific good is too high in the sense of higher than true.
Then the growth rate of real GDP (using base year prices) will be too low.
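The compounding logic in the first item can be made concrete with a small sketch. The inflation rates, the size of the bias and the initial payment below are hypothetical values picked for illustration; the function `indexed_payment` is likewise an illustrative helper.

```python
def indexed_payment(initial_payment, inflation_rate, years):
    """Payment after indexing to a constant inflation rate for a number of years."""
    return initial_payment * (1 + inflation_rate) ** years

true_inflation = 0.020       # hypothetical growth rate of a true cost-of-living index
measured_inflation = 0.025   # hypothetical CPI inflation, biased upward by 0.5 points
payment_0 = 1000.0           # hypothetical initial benefit payment

# Ratio of the payment actually made to the payment needed to keep up with the
# true cost of living; the ratio grows with the horizon because the bias compounds.
ratios = {}
for years in (1, 10, 20, 30):
    paid = indexed_payment(payment_0, measured_inflation, years)
    needed = indexed_payment(payment_0, true_inflation, years)
    ratios[years] = paid / needed
    print(years, round(ratios[years], 4))
```

Even a small annual bias produces a growing wedge, which is the sense in which an upward-biased index makes payments "bear interest".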

2.8 Cost-of-Living Index


One of the important uses of actual price indexes is as an empirical measure of the “cost of living".
To an economist this term means something quite different from what a layman might think that

it means. To an economist, a cost-of-living index is a theoretical concept. A cost-of-living index
measures the minimum cost of achieving a fixed level of utility over time as prices change. Thus,
the “cost of living" is well defined within consumer theory. Within the theory, (i) consumers
have preferences over different bundles of goods, (ii) preference rankings among bundles can
be represented by a utility function, (iii) the utility function does not change over time even
though goods prices and consumer incomes may be changing and (iv) the consumer picks the best
consumption bundle which is in the budget constraint, defined by prices and consumer income.
This is standard vanilla consumer theory from introductory-level economics.
Figure 2.4 illustrates this idea in the case where there are only two goods: yoga lessons and
game tickets. Here the consumer chooses a point (x1∗ , x2∗ ) which is a best choice, given prices
and income. Note that the consumer’s indifference curve passing through (x1∗ , x2∗ ) is tangent to the
budget line which also passes through (x1∗ , x2∗ ). Tangency reflects the idea that any bundle giving
higher utility (located northeast of (x1∗ , x2∗ )) is not in the budget set - not affordable given prices and
consumer income.

Figure 2.4: Consumer Choice with 2 Goods. [Diagram: yoga lessons on the horizontal axis and game tickets on the vertical axis; the indifference curve is tangent to the budget line at (x1∗ , x2∗ ).]

Now suppose that the people in charge of computing the consumer price index (CPI) go out and
observe the actual year 1 consumption choices (x1∗ , x2∗ ) and corresponding (base year) prices (p∗1 , p∗2 ).
These consumption choices are now assumed to be the basis for the fixed weights, discussed in the
last section, for calculating the CPI. Furthermore, now suppose that in year 2 the CPI folk observe
new prices (p1,2 , p2,2 ) that differ from base year prices. They could then compute the CPI2 for year
2 as follows:

$$CPI_2 = \frac{p_{1,2} x_1^* + p_{2,2} x_2^*}{p_1^* x_1^* + p_2^* x_2^*}$$
Lastly, let us suppose that the consumers in this world are simply collecting social security
retirement benefit checks, issued by the U.S. government, as their sole source of income. Social
security in this world ties the value of these checks to the CPI. Thus, if the CPI goes up, then the
amount of dollars on the check goes up proportionally to the CPI measure of the increase in the
“cost of living".

The interesting question is then to ask whether using the CPI in this way over compensates,
under compensates or correctly compensates these retirees for changes in the cost of living. Thus,
does the CPI serve as a cost-of-living index as we defined it above? The answer is NO. Specifically,
using the CPI in this way will in theory sometimes over compensate in the sense that it gives too
much money to the retirees so that the retirees will be able to get strictly more utility in year 2 than
in year 1.
This over compensation always occurs when two conditions hold. The first condition is that
between year 1 and year 2 there is a change in relative prices so that $p_{1,2}/p_{2,2} \neq p_1^*/p_2^*$. The second condition
is that the indifference curve through the original point (x1∗ , x2∗ ) is “smooth" in that there is no kink
so that the indifference curve has a unique tangent line at this point. As long as both these hold,
then the new budget line in year 2 will still run through the point (x1∗ , x2∗ ) but will cut through the old
indifference curve. The upshot is that because the new budget line cuts through the old indifference
curve there will be a better consumption choice in year 2 than the choice (x1∗ , x2∗ ) that was optimal
in year 1 at year 1 prices and incomes.7

Figure 2.5: Consumer Choice with Cost of Living Indexation and Relative Price Change. [Diagram: yoga lessons on the horizontal axis and game tickets on the vertical axis; shows the base-year indifference curve and the year-2 budget line through (x1∗ , x2∗ ).]

Figure 2.5 illustrates this idea. The solid budget line represents the consumption possibilities in
the base year. The dotted budget line represents the consumption possibilities in year 2 after two
things happen: (1) prices change and (2) the retiree’s income is adjusted using the CPI, so that
the initial basket (x1∗ , x2∗ ) can be afforded at the new prices. The change in the budget line from the
base year to year 2 in Figure 2.5 is consistent with a larger proportional increase of the price of
game tickets compared to the price of yoga lessons. This implies a fall in the relative price of yoga
lessons. Thus, the new budget line is flatter and consuming a bit more in yoga lessons and a bit less
in game tickets would be a way to increase utility. This is illustrated by moving from point (x1∗ , x2∗ )
to any point along the segment of the dotted line in Figure 2.5 that lies above the indifference curve
attained in the base year.
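The over compensation can also be verified numerically. The sketch below assumes a specific utility function, u(x1, x2) = x1^α x2^(1−α) with α = 0.5, which is not taken from the text; it is used only because its demand and minimum-expenditure formulas are simple. Prices and income are hypothetical. Good 1 is yoga lessons and good 2 is game tickets.

```python
ALPHA = 0.5  # hypothetical expenditure share on yoga lessons

def demand(p1, p2, income):
    """Utility-maximizing bundle for u(x1, x2) = x1**ALPHA * x2**(1 - ALPHA)."""
    return ALPHA * income / p1, (1 - ALPHA) * income / p2

def utility(x1, x2):
    return x1 ** ALPHA * x2 ** (1 - ALPHA)

def min_expenditure(p1, p2, u):
    """Minimum cost of reaching utility level u at prices (p1, p2)."""
    return u * (p1 / ALPHA) ** ALPHA * (p2 / (1 - ALPHA)) ** (1 - ALPHA)

# Base year: hypothetical prices and income.
p1_base, p2_base, income_base = 1.0, 1.0, 100.0
x1_star, x2_star = demand(p1_base, p2_base, income_base)
u_base = utility(x1_star, x2_star)

# Year 2: the price of game tickets doubles; the price of yoga lessons is unchanged.
p1_new, p2_new = 1.0, 2.0

# Fixed-basket CPI and the CPI-indexed income of the retiree.
cpi_2 = (p1_new * x1_star + p2_new * x2_star) / (p1_base * x1_star + p2_base * x2_star)
income_new = cpi_2 * income_base
u_new = utility(*demand(p1_new, p2_new, income_new))

# Cost-of-living index: minimum cost of the base-year utility level, year 2 vs. base year.
col_2 = min_expenditure(p1_new, p2_new, u_base) / min_expenditure(p1_base, p2_base, u_base)

assert u_new > u_base   # the retiree is strictly better off in year 2
assert col_2 < cpi_2    # the CPI rises by more than the cost-of-living index
```

In this example the CPI is 1.5 while the cost-of-living index is the square root of 2, about 1.41, so CPI indexation hands the retiree more income than is needed to reach the base-year utility level.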
7 There is nothing special about the choice of illustrating this theoretical point in the case of exactly two consumption
goods. If there are three goods, then the idea is that there is always over compensation when the plane describing the
“budget line" cuts through the indifference surface at the best point (x1∗ , x2∗ , x3∗ ) when prices change. With more than three
goods the same ideas apply but visualization is difficult.

The economics of this over compensation in using a fixed weight price index such as the CPI
as a cost-of-living index has been understood at a theoretical level for well over half a century.
There have been several literature surveys which have discussed the likely empirical magnitude
of the annual over compensation due to the “substitution effect" highlighted in this section.8 The
Congressional Budget Office (1994) suggested that the overcompensation was between 0.2 and 0.8
percent per year.9

2.9 Key Concepts


Intermediate good is a good which is produced but sold to another producer and embodied
in some other (intermediate or final) good.
Final good is a good which is produced and then sold to the household sector.
Real GDP is the value of all final goods and services produced domestically over a period of
time, where the value is measured using base-year prices.
Nominal GDP is the value of all final goods and services produced domestically over a
period of time, where the value is measured using current-year prices.
Value Added equals the sales of a firm less the cost of the intermediate goods purchased by
that firm.
Cost-of-living Index measures the minimum cost of achieving a fixed level of utility over
time as prices change.

8 See Moulton (1996), Journal of Economic Perspectives, vol. 10, 159-77 for a discussion of (1) details of how the
CPI is computed in the U.S., (2) plausible magnitudes of any bias and (3) what statistical agencies were doing to allow
the CPI to more accurately mimic a cost-of-living index.
9 Congressional Budget Office (1994), Is the Growth of the CPI a Biased Measure of Changes in the Cost of Living?,
Washington DC.
3. Growth Theory

3.1 Introduction
This chapter describes the basic elements of the theory of economic growth that were laid out in
the work of Robert Solow. The growth theory that practitioners of economics use to this day is
influenced by this work. Solow received the Nobel prize in economics in 1987 for two important
papers on economic growth.1 Solow’s first paper provides a model of economic growth that can
explain some of Nicholas Kaldor’s growth facts. Kaldor’s facts consist of six empirical regularities
about economic growth. Solow’s second paper provides a method to decompose observed output
growth into the growth of inputs and the growth of technology. This chapter discusses each of these
contributions. We start out by discussing some useful properties of production functions and the
theory of profit maximization when firms take prices as given.

3.2 Production Function


A key abstraction used in growth theory is that there exists an aggregate production law that relates
the total amount of inputs of capital and labor used in the economy, in a given period, and the
amount of output produced. Growth theory is based on a production function with very specific
properties. Thus, we start the chapter by presenting the aggregate production function and reviewing
its key properties. The production function Yt = At F(Kt , Lt ) describes the output level Yt which can
be produced by using quantities of capital Kt and labor Lt . The variable At is a measure of the level
of technological sophistication of the economy at time t. Higher At represents better technology in
the sense that, with higher At , more output can be produced using the same amount of inputs.

Definition 3.2.1 We now define the Aggregate Production Function:

Yt = At F(Kt , Lt )

1 See Solow (1956), A Contribution to the Theory of Economic Growth, Quarterly Journal of Economics, 70, 65-94
and Solow (1957), Technical Change and the Aggregate Production Function, Review of Economics and Statistics, 39,
312-20.

Yt : output at time t
Kt : capital input at time t. The capital stock is a stock of capital goods that are devoted
to production. The capital input is a flow of production services from capital that is
proportional to the stock of capital.
Lt : labor input at time t. The labor input is the flow of labor services proportional to the
number of workers Lt .
At : technology level at time t

Standard properties of the production function are now discussed.

Constant Returns to Scale


A production function Y = AF(K, L) has constant returns to scale if it has the property that, when
all factor inputs are scaled up or down by a common factor, then output is also scaled by the same
factor. For example, when all factor inputs are doubled it must be true that output is also doubled.
Similarly, when all factor inputs are halved it must be true that output is also halved. This definition
can be described using a simple mathematical terminology. Below, λ is the factor by which all
inputs are scaled up or down.

Definition 3.2.2 A production function has constant returns to scale if

$$\lambda Y = AF(\lambda K, \lambda L), \text{ for all } \lambda > 0$$

The mathematical expression says that if both inputs are multiplied by any number λ greater
than zero, then output is also multiplied by λ .

We now discuss a key implication of production functions with constant returns to scale: When
the production function has constant returns to scale, the amount of output per unit of labor
depends exclusively on the amount of capital per unit of labor, and the level of technology. To
illustrate this, imagine that there are two countries with the same production function and the same
technological level A. Then, the “size” of the countries is irrelevant for determining which country
is richer in the sense of having a larger output per unit of labor. The only thing that is relevant is
which country has more capital per unit of labor, K/L. This logic is expressed below in mathematical
terms. It follows from the definition of constant returns to scale when the scaling factor is set to
λ = 1/L.

$$\frac{Y}{L} = AF\left(\frac{K}{L}, \frac{L}{L}\right) = AF\left(\frac{K}{L}, 1\right).$$
The concept of constant returns to scale describes what happens to output when all inputs are
increased by a common factor. We now describe a key property of the production function when
only one input is varied, keeping all other inputs constant.
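Constant returns to scale and its per-worker implication can be checked with a few lines of Python. The sketch assumes a particular functional form, F(K, L) = K^β L^(1−β) with β = 0.35, purely for illustration; the input quantities and the two "countries" are hypothetical.

```python
import math

def output(A, K, L, beta=0.35):
    """Y = A * F(K, L) with the illustrative choice F(K, L) = K**beta * L**(1 - beta)."""
    return A * K ** beta * L ** (1 - beta)

A = 1.0
K, L = 200.0, 50.0

# Constant returns to scale: scaling both inputs by lam scales output by lam.
for lam in (0.5, 2.0, 10.0):
    assert math.isclose(output(A, lam * K, lam * L), lam * output(A, K, L))

# Two hypothetical countries of different size with the same capital per unit of labor.
small = output(A, 40.0, 10.0) / 10.0
large = output(A, 4000.0, 1000.0) / 1000.0
assert math.isclose(small, large)

# Output per unit of labor depends only on K/L and A: Y/L = A * F(K/L, 1).
assert math.isclose(small, output(A, 4.0, 1.0))
```

The large country produces one hundred times as much output as the small one, yet output per unit of labor is identical because K/L and A are the same.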

Diminishing Marginal Products


The marginal product of an input is the additional output that would be obtained by a small
increase in that input.2 A production function Y = AF(K, L) has a diminishing marginal product of
capital if the marginal product of capital falls as the quantity of capital input is increased, holding
other inputs constant. Similarly, a production function Y = AF(K, L) has a diminishing marginal
product of labor provided that the marginal product of labor falls as the quantity of labor is increased,
holding other inputs constant. In what follows it will often be helpful to have some notation to
2 Mathematically, the marginal product is the slope of the production function as the relevant input increases. This
slope can be calculated by taking the derivative of output with respect to the relevant input.

denote the marginal products of capital and labor. The notation adopted here is to use a subscript K
or L to denote that one is talking about the marginal product of capital or labor.3 Marginal products
are functions as a marginal product will depend on the quantities of inputs that are available.
AFK (K, L) : marginal product of capital
AFL (K, L) : marginal product of labor

Implications of Profit Maximization


In order to maximize profits, a firm must choose the amount of each input it uses so that the
marginal revenue obtained by adding a small amount of the input is equal to its marginal cost. This
is a powerful idea from microeconomics. Assume that the price of output in the economy is 1, the
price of renting one unit of labor for one period is W and the price of renting one unit of capital
for one period is R. Then the revenue of a firm is AF(K, L), which is just the production of the
firm. The labor cost is W L and the capital cost is RK. The profits of the firm are given by revenue
minus cost, or AF(K, L) −W L − RK. If the firm takes the price of output as given, the marginal
revenue of increasing the capital input is thus equal to the marginal product of capital. Similarly,
the marginal revenue of increasing labor is equal to the marginal product of labor. If the firm takes
the prices of inputs as given (competitive assumption), then the marginal cost of labor is constant
and equal to W and the marginal cost of capital is constant and equal to R. With these concepts in
hand, we now describe the profit maximization problem.

Theorem 3.2.1 The profit maximization problem consists of choosing the amount of capital and
labor inputs to maximize profit, taking the price of output and the prices of factor inputs as given:

max AF(K, L) −W L − RK

When a firm’s inputs (K, L) maximize profit, then each input’s marginal product is equal to the
price of the factor of production:
1. AFL (K, L) = W
2. AFK (K, L) = R

The theory implies that if the firm is maximizing profit then implications 1.- 2. above hold. The
reason is that if either of these conditions did not hold, then the firm could increase profit by varying
the input. For example consider a specific quantity of inputs (K0 , L0 ) such that AFK (K0 , L0 ) > R.
The extra revenue of renting one extra unit of capital is greater than the cost of doing so. Therefore, the
firm could increase profit by a positive amount AFK (K0 , L0 ) − R > 0 by simply using an additional
unit of capital.
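The logic of Theorem 3.2.1 can be illustrated numerically. The sketch below assumes a Cobb-Douglas form for F with hypothetical values of β and A, picks hypothetical inputs (K0, L0), and sets factor prices equal to the marginal products at that point. It then checks that no other input combination on a coarse grid earns more profit, and that when the marginal product of capital exceeds R an extra unit of capital raises profit.

```python
BETA = 0.35  # hypothetical parameter of the assumed production function
A = 1.0      # hypothetical technology level

def output(K, L):
    return A * K ** BETA * L ** (1 - BETA)

def mpk(K, L):
    """Marginal product of capital A*F_K(K, L) for the assumed functional form."""
    return BETA * A * (L / K) ** (1 - BETA)

def mpl(K, L):
    """Marginal product of labor A*F_L(K, L) for the assumed functional form."""
    return (1 - BETA) * A * (K / L) ** BETA

def profit(K, L, W, R):
    return output(K, L) - W * L - R * K

# Hypothetical inputs; factor prices are set equal to the marginal products there.
K0, L0 = 100.0, 50.0
W, R = mpl(K0, L0), mpk(K0, L0)

# No other input choice on a coarse grid earns more profit than (K0, L0).
scales = (0.5, 0.9, 1.0, 1.1, 2.0)
grid = [(K0 * a, L0 * b) for a in scales for b in scales]
best = max(profit(K, L, W, R) for K, L in grid)
assert best <= profit(K0, L0, W, R) + 1e-9

# If instead A*F_K > R, renting one more unit of capital raises profit.
K_low = 50.0
assert mpk(K_low, L0) > R
assert profit(K_low + 1, L0, W, R) > profit(K_low, L0, W, R)
```

Because the assumed production function has constant returns to scale, the maximized profit in this sketch is zero up to rounding error, and scaling K0 and L0 by a common factor leaves profit unchanged.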

Zero Economic Profits


When firms have constant returns to scale and they take the prices of inputs as given, one can derive
the result that economic profits are zero for profit maximizing choices of inputs. The result is that
all output is paid to the factors of production and no output is left as economic profit. To obtain
this result first consider the following mathematical property of production functions with constant
returns to scale.
Y = AF(K, L) = AFK (K, L) × K + AFL (K, L) × L (3.1)
This property follows from Euler’s Homogeneous Function Theorem.4
3 Since the marginal products will also depend on the quantities of capital and labor employed, the notation indicates
that the relevant marginal products are functions of these factor inputs.
4 Assume that the production function is differentiable and has constant returns: λY = AF(λ K, λ L) holds for all
λ > 0. Differentiate each side of this equation with respect to λ and use the Chain Rule to get Y = AFK (λ K, λ L)K +

The condition says that total output is equal to the sum of capital times the marginal product of
capital plus labor times the marginal product of labor. From the profit maximization problem, we
know that a profit-maximizing firm that takes factor prices as given will increase the use of each
production factor until the value of its marginal product equals its marginal cost. This was stated in
Theorem 3.2.1. Thus, we can replace the marginal products with the factor prices, to obtain the
equation below.

Y = AF(K, L) = AFK (K, L)K + AFL (K, L)L = RK +W L


The equation says that output for a profit maximizing firm is equal to the sum of the total
payments to capital and labor. So there is nothing left of the firm’s revenue after paying for factors:
economic profits Y − RK −W L must in theory be zero.5

Cobb-Douglas Production Function


The Cobb-Douglas production function is one of the most widely used functions in economics. It is
a practical functional form that has constant returns to scale and diminishing marginal products. In
addition to these properties, the Cobb-Douglas has the property that the share of output paid
out to the capital input is always equal to a constant β , independent of the quantities of capital and
labor employed, when factor prices equal marginal products.
Economist and politician Paul Douglas was interested in finding a mathematical production
function that could represent the U.S. production side of the economy. Douglas provided initial
evidence that factor shares in U.S. data have no strong trend movements (although they fluctuate at
business cycle frequencies). Thus, economists view constant factor shares as a useful approximation
at least over some time periods. Douglas and the mathematician Charles Cobb found all the
production functions that had constant returns to scale and constant factor shares. The result is the
Cobb-Douglas production function below.6

$$Y = AF(K, L) = AK^{\beta} L^{1-\beta} \text{ where } 0 < \beta < 1 \text{ and } A > 0$$ (3.2)

The marginal products of capital and labor, for the Cobb-Douglas, are as follows:

$$AF_K(K, L) = \beta AK^{\beta-1} L^{1-\beta} = \beta A(L/K)^{1-\beta}$$ (3.3)

$$AF_L(K, L) = (1-\beta) AK^{\beta} L^{-\beta} = (1-\beta) A(K/L)^{\beta}$$ (3.4)

These marginal products are obtained by taking the derivative of the production function with
respect to K and with respect to L, respectively.7 It is easy to see how these two marginal products
move when capital or labor is varied. For example, in equation (3.3), notice that K appears in the
denominator, and that 1 − β is a positive exponent. This implies that the marginal product of capital
falls when K increases, other things equal. This is the property of diminishing marginal products.
In equation (3.4) note that the marginal product of labor increases as capital increases, other things
equal. Thus, capital and labor are complements in a Cobb-Douglas production function.
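Equations (3.3) and (3.4) can be checked against a numerical derivative of the production function. The parameter values and input quantities in this Python sketch are hypothetical, and the helper `slope` is an illustrative central-difference approximation.

```python
import math

BETA, A = 0.35, 1.0  # hypothetical parameter values

def output(K, L):
    return A * K ** BETA * L ** (1 - BETA)

def mpk_formula(K, L):
    """Marginal product of capital from equation (3.3)."""
    return BETA * A * (L / K) ** (1 - BETA)

def mpl_formula(K, L):
    """Marginal product of labor from equation (3.4)."""
    return (1 - BETA) * A * (K / L) ** BETA

def slope(f, x, h=1e-4):
    """Central-difference approximation to the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

K, L = 100.0, 50.0

# The formulas agree with the numerical slopes of the production function.
assert math.isclose(slope(lambda k: output(k, L), K), mpk_formula(K, L), rel_tol=1e-6)
assert math.isclose(slope(lambda l: output(K, l), L), mpl_formula(K, L), rel_tol=1e-6)

# Diminishing marginal product: more capital lowers the marginal product of capital.
assert mpk_formula(2 * K, L) < mpk_formula(K, L)

# Complementarity: more capital raises the marginal product of labor.
assert mpl_formula(2 * K, L) > mpl_formula(K, L)
```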
AFL (λ K, λ L)L. By setting λ = 1 one proves the assertion. Leonhard Euler was a very famous Swiss mathematician,
who lived in the 1700’s.
5 Table 2.2 from the previous chapter indicated that corporate profits in the US exceeded 10 percent of GDP in 2016.
This fact does not contradict the theory. In calculating corporate profit in the US, corporate accountants do not subtract
the implicit rental value of all the buildings and equipment owned by a corporation.
Thus, it is not surprising that corporate (accounting) profits in the US are positive.
6 See Cobb, C. W.; Douglas, P. H. (1928). “A Theory of Production”. American Economic Review 18: 139-165.
7 The relevant result from calculus involved in taking these derivatives is that the slope of the function $y = x^a$ is given
by $dy/dx = a x^{a-1}$.

We now calculate the share of output paid to capital and to labor when the production function
has the Cobb-Douglas form. We use the implication that factors are paid their marginal products
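(i.e. W = AFL (K, L) and R = AFK (K, L)) and the Cobb-Douglas form $Y = AK^{\beta} L^{1-\beta}$.

$$\frac{RK}{Y} = \frac{AF_K(K, L)K}{Y} = \frac{\beta AK^{\beta-1} L^{1-\beta} K}{Y} = \beta$$ (3.5)

$$\frac{WL}{Y} = \frac{AF_L(K, L)L}{Y} = \frac{(1-\beta) AK^{\beta} L^{-\beta} L}{Y} = (1-\beta)$$ (3.6)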
It turns out that the fraction or share of output paid to each factor is fixed for this production function
and does not depend on the exact quantity of the two inputs or the level of technology. Further, the
fraction is controlled by the parameter β so one can adapt the Cobb-Douglas production function
to replicate capital’s share as calculated in the data by just setting β to the value from the data.
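A quick numerical check of equations (3.5) and (3.6) is sketched below. The value β = 0.35 mirrors the capital share of roughly 35 percent reported in the previous chapter; the combinations of A, K and L are hypothetical and chosen to differ widely.

```python
import math

def factor_shares(A, K, L, beta):
    """Return (capital share, labor share) when R = A*F_K and W = A*F_L."""
    Y = A * K ** beta * L ** (1 - beta)
    R = beta * A * K ** (beta - 1) * L ** (1 - beta)   # equation (3.3)
    W = (1 - beta) * A * K ** beta * L ** (-beta)      # equation (3.4)
    return R * K / Y, W * L / Y

beta = 0.35
for A, K, L in [(1.0, 100.0, 50.0), (2.5, 10.0, 400.0), (0.7, 3000.0, 20.0)]:
    capital_share, labor_share = factor_shares(A, K, L, beta)
    # Shares do not depend on input quantities or on the technology level.
    assert math.isclose(capital_share, beta)
    assert math.isclose(labor_share, 1 - beta)
    # The two shares exhaust output, so economic profits are zero.
    assert math.isclose(capital_share + labor_share, 1.0)
```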

Types of Technological Change


Given a production function, technical change or technological progress can be modeled as an
increase in the technological parameter A over time. The literature on technological change
distinguishes between embodied and disembodied technological change. A disembodied technological
improvement occurs when existing capital and labor benefit from the new technology. In contrast,
an embodied technological improvement occurs when only new capital benefits from the new
technology. An example of an embodied technological improvement is the invention of a faster
computer chip. Without the purchase of the new chip it is not possible to take advantage of the
new technology. This type of technological change is quite common as new technologies are often
embodied in new buildings, new cars and new machines.
In the case of disembodied technical improvement, there are many examples that involve
productive ways of reorganizing the production process. Adam Smith’s famous description of the
“pin” factory is such an example.8 Smith observed that a pin factory could make many more pins
if each worker was devoted to a single task instead of each worker executing sequentially all the
tasks leading to producing one pin. The tasks in making a pin were straightening the wire, cutting
the wire, sharpening the point, attaching the head and putting the pins into a box. Organizing
production in this way did not require any new investment. A similar example is the creation of the
assembly line, which is widely used in the automotive industry. This technological improvement
was vast. At a later stage, automotive assembly involved substantial investment to organize the
assembly line in a way to substitute machines for tasks using labor.
Growth theory usually focuses on disembodied technological change.
The reason is that disembodied technical change is simpler to handle at a theoretical level because
one doesn’t need to keep track of all the distinct types or vintages of capital.
Disembodied technological change can be of three types: capital augmenting, labor augmenting
and neutral. These cases are distinguished by whether technology augments the production services
from capital (KA), augments the production services from labor (LA) or augments both. Solow
growth theory is based on labor-augmenting technological change.

Y = F(KA, LA) = AF(K, L): Neutral (3.7)


Y = F(K, LA): Labor Augmenting (3.8)
Y = F(KA, L): Capital Augmenting (3.9)

8 See Adam Smith’s (1776), The Wealth of Nations.

So far, in the discussion of the properties of the production function, we have only considered
neutral technical change. However, one can consider a Cobb-Douglas production function with
either labor-augmenting or capital-augmenting technical change and verify that the
properties of constant returns to scale, decreasing marginal returns and zero profits still hold.
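
As a quick illustration of this claim, the sketch below checks the three properties numerically for the labor-augmenting case $Y = K^{\beta}(AL)^{1-\beta}$. It is not from the original text: the function names and the values of K, L, A and β are assumptions chosen only for the check.

```python
# Cobb-Douglas with labor-augmenting technology: Y = K**beta * (A*L)**(1 - beta).
# The inputs K, L, A and the exponent beta are illustrative assumptions.

def F_labor_aug(K, L, A, beta=0.3):
    """Output when technology A multiplies labor only."""
    return K**beta * (A * L)**(1 - beta)

def marginal_products(K, L, A, beta=0.3):
    """Return (MPK, MPL); for Cobb-Douglas these equal beta*Y/K and (1-beta)*Y/L."""
    Y = F_labor_aug(K, L, A, beta)
    return beta * Y / K, (1 - beta) * Y / L

K, L, A = 4.0, 9.0, 2.0
Y = F_labor_aug(K, L, A)
mpk, mpl = marginal_products(K, L, A)

# Constant returns to scale: doubling K and L doubles output.
crs_gap = F_labor_aug(2 * K, 2 * L, A) - 2 * Y
# Zero profits: paying each factor its marginal product exhausts output.
profit = Y - mpk * K - mpl * L
# Diminishing marginal product: more capital with the same labor lowers MPK.
mpk_falls = marginal_products(2 * K, L, A)[0] < mpk

print(crs_gap, profit, mpk_falls)
```

The capital-augmenting case can be checked the same way by moving A from the labor term to the capital term.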

3.2.1 Relative Wages in US Data


An aggregate production function is potentially useful for interpreting data on relative wage rates.
Figure 3.1 plots data on the (log) of the ratio of average college wages to average high school wages
in US data from Goldin and Katz (2007, Table 1).9 The data show that the log wage ratio (also
known as the college wage premium) is U-shaped over the last hundred years. In 1915 college
wages were roughly 90 percent above high school wages.10 This wage ratio fell until 1950 and has
largely increased since then so that in 2005 college wages are again roughly 90 percent above high
school wages. Over the same time period, the (log) ratio of total college labor hours to total high
school labor hours has increased year by year. Goldin and Katz describe the history behind this
trend. For simplicity, the log of the ratio of college to high school hours was normalized or scaled
to equal 0 in 1915.

[Figure: "College Wage Premium: US 1915-2005". Series plotted against year: log college-high school wage ratio and log supply ratio.]

Figure 3.1: US Relative Wages

The neoclassical theory developed in this chapter says that under competitive markets factors
of production get paid their marginal products. Thus, the ratio $W_t^c/W_t^h$ of the college wage to the
high school wage should, according to this theory, equal the ratio of the marginal product of college
to high school labor. The equation below highlights this implication using the assumption of an
aggregate production function $Y_t = A_t F(K_t, L_t^h, L_t^c)$ with inputs of capital $K_t$ and high school and
college labor $(L_t^h, L_t^c)$ and technology level $A_t$. As in previous sections, subscripts $(c, h)$ next to the
production function are used to denote marginal products of college labor ($c$) and high school labor
($h$).
9 See “The Race between Education and Technology” by Claudia Goldin and Lawrence Katz, NBER Working Paper
12984.
10 Since the (natural) log of the wage ratio is 0.64 in 1915 according to the US data, $\ln(W^c/W^h) = 0.64 \Rightarrow
W^c/W^h = e^{0.64} \approx 1.90$, so college wages are above high school wages by 90 percent. Recall that
$e \approx 2.718$ is the base for natural logs. The Wikipedia entry for natural log describes properties of logs.

$$\frac{W_t^c}{W_t^h} = \frac{A_t F_c(K_t, L_t^h, L_t^c)}{A_t F_h(K_t, L_t^h, L_t^c)}$$

To make some initial progress on understanding forces that might shape the college premium,
we will posit specific functional forms for the aggregate production function. We will use the fact
that Goldin and Katz have measured the ratio of aggregate college labor to aggregate high school
labor and have found that the ratio increases over time in the US.

An Unsuccessful Theory
Consider the Cobb-Douglas production function below. The exponents (β , αh , αc ) can be inter-
preted as the share of output paid to capital and to high school and college labor respectively. The
term At captures neutral technological change. The second equation below computes the marginal
products and simplifies by canceling common terms in the numerator and denominator. The third
equation below takes the log of both sides of the second equation and simplifies.

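$$Y_t = A_t F(K_t, L_t^h, L_t^c) = A_t K_t^{\beta} (L_t^h)^{\alpha_h} (L_t^c)^{\alpha_c}$$

$$\frac{W_t^c}{W_t^h} = \frac{A_t F_c(K_t, L_t^h, L_t^c)}{A_t F_h(K_t, L_t^h, L_t^c)}
= \frac{\alpha_c A_t K_t^{\beta} (L_t^h)^{\alpha_h} (L_t^c)^{\alpha_c - 1}}{\alpha_h A_t K_t^{\beta} (L_t^h)^{\alpha_h - 1} (L_t^c)^{\alpha_c}}
= \frac{\alpha_c}{\alpha_h} \frac{L_t^h}{L_t^c}$$

$$\log \frac{W_t^c}{W_t^h} = \log \frac{\alpha_c}{\alpha_h} + \log \frac{L_t^h}{L_t^c} = \log \frac{\alpha_c}{\alpha_h} - \log \frac{L_t^c}{L_t^h}$$
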
The theory implies that variation in the log of the relative wage is pinned down by variation in
the log of the relative supply of the two types of labor. This should be intuitive as the Cobb-Douglas
production function displays diminishing marginal products and, therefore, increases in an input
decrease its marginal product other things held equal. If total college labor grows faster than total
high school labor (as the US data in Figure 3.1 show) then the log of the ratio of college to high
school wages must decrease regardless of the technology level $A_t$ or the precise values for the
exponents in the Cobb-Douglas production function. This prediction is qualitatively inconsistent
with US data after 1950, when the college wage premium rose even though the relative supply of
college labor kept growing.
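
The prediction of the Cobb-Douglas theory can be traced out with a few lines of code. This is a sketch under stated assumptions: the exponents `alpha_c` and `alpha_h` and the rising supply path are hypothetical numbers, not values taken from Goldin and Katz.

```python
import math

def log_wage_ratio_cd(log_supply_ratio, alpha_c, alpha_h):
    """Cobb-Douglas prediction: log(Wc/Wh) = log(alpha_c/alpha_h) - log(Lc/Lh)."""
    return math.log(alpha_c / alpha_h) - log_supply_ratio

alpha_c, alpha_h = 0.3, 0.4                  # hypothetical labor exponents
supply_path = [0.0, 0.5, 1.0, 1.5, 2.0]      # hypothetical rising log(Lc/Lh)

premium_path = [log_wage_ratio_cd(x, alpha_c, alpha_h) for x in supply_path]
for x, p in zip(supply_path, premium_path):
    print(f"log supply ratio {x:.1f} -> predicted log wage ratio {p:.3f}")
```

Because the log supply ratio enters with a coefficient of minus one and the technology level drops out, any rising supply path produces a falling premium, whatever exponents are chosen.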

A More Successful Theory


Goldin and Katz posit the production function below with two types of technological change:
At and λt . At is neutral technological change. An increase in At increases all marginal products
proportionally, other things equal. λt is a source of technological change which complements
college labor but not high school labor. Thus, other things equal, an increase in λt increases the
marginal product of college labor and decreases the marginal product of high school labor. Other
things equal, an increase in neutral technological change At does not change the wage ratio because
a neutral technological increase lifts both marginal products proportionally.
Below we repeat, for the Goldin-Katz formulation of the production function, the type of
calculations that we made for the "unsuccessful theory". Thus, to get the second line below we (i)
maintain that wage ratios are ratios of marginal products and (ii) calculate the ratio of marginal
products and simplify the algebra. The third line below takes logs of the equation in the second line
below.

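$$Y_t = A_t F(L_t^h, L_t^c, \lambda_t) = A_t \left[ \lambda_t (L_t^c)^{\rho} + (1 - \lambda_t)(L_t^h)^{\rho} \right]^{1/\rho}$$

$$\frac{W_t^c}{W_t^h} = \frac{A_t F_c(L_t^h, L_t^c, \lambda_t)}{A_t F_h(L_t^h, L_t^c, \lambda_t)} = \frac{\lambda_t (L_t^c)^{\rho - 1}}{(1 - \lambda_t)(L_t^h)^{\rho - 1}}$$

$$\log \frac{W_t^c}{W_t^h} = \log \frac{\lambda_t}{1 - \lambda_t} + (\rho - 1) \log \frac{L_t^c}{L_t^h} \quad \text{where } \rho - 1 < 0$$
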
This theory suggests that variation in the log wage ratio over time is a race between two forces:
education and technology. On the one hand, increases in technology that complements college labor
(e.g. computers and information technology) increase the college premium. This is captured by the
term $\lambda_t/(1 - \lambda_t)$. On the other hand, increases in the relative supply of college labor are a force
decreasing the college premium. This is captured by the term $L_t^c/L_t^h$. Goldin and Katz argue that
technology has won this race since the 1950s and accounts for the increase in the college premium.
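
The race can be illustrated with the log wage ratio equation above. The sketch below is only an illustration: the value of ρ and the values of λ and the log supply ratio in each scenario are hypothetical, chosen to show the two forces, and are not estimates from Goldin and Katz.

```python
import math

def log_wage_ratio_ces(lam, log_supply_ratio, rho):
    """log(Wc/Wh) = log(lam/(1-lam)) + (rho-1)*log(Lc/Lh), with rho - 1 < 0."""
    return math.log(lam / (1 - lam)) + (rho - 1) * log_supply_ratio

rho = 0.3   # hypothetical value satisfying rho - 1 < 0

# Starting point: lam = 0.5 and log supply ratio normalized to 0.
base = log_wage_ratio_ces(0.5, 0.0, rho)
# Education alone: relative supply of college labor rises, technology unchanged.
education_only = log_wage_ratio_ces(0.5, 1.0, rho)
# Education and technology: lam rises enough to outweigh the supply increase.
both = log_wage_ratio_ces(0.8, 1.0, rho)

print(f"base: {base:.3f}, education only: {education_only:.3f}, both: {both:.3f}")
```

In this example the supply increase alone lowers the premium, while the same supply increase combined with a sufficiently large rise in λ raises it, which is the pattern Goldin and Katz attribute to the period after 1950.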

3.3 Solow Growth Model


It is helpful to have some facts in mind when formulating theories. We now enumerate six important
facts about economic growth that guide growth theories.

Definition 3.3.1 — Kaldor’s Growth Facts. The following six facts about (modern) economic
growth were set down by Nicholas Kaldor in 1957:a
1. Output per capita grows over time.
2. Capital per capita grows over time.
3. The capital-output ratio is approximately constant over time.
4. Capital and labor’s share of output is approximately constant over time.
5. The return to capital does not have a strong trend.
6. Levels of output per capita vary widely across countries.
a See Kaldor, Nicholas (1957). "A Model of Economic Growth". The Economic Journal. 67 (268): 591–624.

A key issue is the degree to which these “facts” describe the behavior of particular countries
over particular time periods. We will only partially address this issue. For fact 1 above, we present
data constructed by Gregory Clark for England and by many authors for the UK as compiled by the
Bank of England.11
Figure 3.2 below plots GDP per capita in the UK and Net National Income (NNI) per capita in
England over several centuries. Labor productivity, measured by GDP per capita or by NNI per
capita, displays sustained growth in the UK since sometime around 1800 or so. Clark’s data shows
that there was not sustained growth in England over the period 1300-1700. Thus, for the UK or
England there is support for Kaldor’s first fact starting sometime around 1800 or slightly earlier.12
The time period starting after roughly 1800 is sometimes referred to as the period of modern
economic growth. Before this time period, average real yearly growth rates over long time periods
(e.g. centuries) for the most advanced economies are believed to be approximately zero. This is
consistent with the findings of Clark for England. Angus Maddison calculates that the average
yearly growth rate of real GDP per man hour in the UK from 1700-1780 was 0.3 percent.13 Thus,
during this period, the average yearly growth rate was quite small in comparison to growth rates for
many advanced economies over the last century.
11 See Gregory Clark, "The Macroeconomic Aggregates for England, 1209-1869" Research in Economic History,
2010. The data for the UK is taken from The Bank of England’s Three Centuries Macroeconomic Dataset Version 2.3 -
30 June 2016.
12 It is striking that Clark (2010) can construct a time series measuring net national income over this period. He uses

the factor incomes approach : NNI = Wages + LandRent + NetHouseRent + OtherCapitalIncome + IndirectTax. His
measure of Wages, in a given year, is based on the English population, measures of farm and non-farm wages per day,
and an assumption of average days of work in the year.
13 See Table 2.2 in Angus Maddison’s (1991) work “Dynamic Forces in Capitalist Development”, Oxford University

Press.

[Figure: "GDP and Labor Productivity" (log scale), plotted against year, 1200-2000. Series: GDP/Pop UK, NNI/Pop England, GDP UK, NNI England.]

Figure 3.2: Labor Productivity: United Kingdom and England

3.3.1 The Basic Solow Model


We will build up to the most general formulation of Solow’s growth model in two steps. The model
introduced in the first step will be called the “basic” Solow growth model. The model created in the
second step, the full Solow model, will be able to explain Kaldor’s growth facts 1-5. A satisfactory
quantitative explanation for Kaldor’s sixth fact is still an open problem in growth theory. Whoever
develops a convincing explanation for the magnitude of the observed differences in output per
capita across countries should be a good candidate to receive a Nobel Prize in economics.
Output is produced via an aggregate production function. According to this function, output
can grow because $A_t$, $K_t$, and/or $L_t$ grow over time. There is no other way for output to grow. In
the basic version of the Solow model, we assume that $A_t = 1$ and that $L_t = L$, so that the only
time-varying variable determining output is $K_t$. The basic Solow growth model is thus described by
the three equations below.

Ct + It = F(Kt , L) (3.10)
Kt+1 = Kt − δ Kt + It (3.11)
It = sF(Kt , L) where 0 < s < 1 (3.12)

The first equation says that from total output F(Kt , L) the amount Ct is consumed and the remainder,
It , is invested. The second equation says that capital in period t + 1 is equal to the capital from
period t, given by Kt , minus the amount of capital that is lost because of depreciation, given by
δ Kt , plus investment in new capital It . Parameter δ is between zero and one and is known as the
depreciation rate. One interpretation is that the depreciation rate is the rate at which buildings
crumble over time without repair. The third equation says that a fixed fraction of output s is invested
into new capital each period. The third equation is a behavioral rule, as it involves a decision,
whereas the first two equations (a resource constraint and an accounting rule for capital) do not.
To analyze the Solow model, combine equations 3.11 and 3.12 to obtain capital in period t + 1
in terms of capital and labor in period t.

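[Figure: $k_{t+1}$ (vertical axis) against $k_t$ (horizontal axis), showing the identity line $k_{t+1} = k_t$ and the law of motion $k_{t+1} = (1 - \delta)k_t + sF(k_t, 1)$, which intersect at $k^*$.]

Figure 3.3: Basic Solow Model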

$$K_{t+1} = sF(K_t, L) + (1 - \delta)K_t$$

This equation describes the evolution of the capital stock $K_t$ for an economy with $L$ workers.
To obtain the law of motion for capital per worker, divide both sides of the equation by $L$ and use
the definition of constant returns to scale. We do so in the following three steps, adopting the
notational convention that $k_t = K_t/L$ denotes capital per worker.

$$\frac{K_{t+1}}{L} = \frac{1}{L} sF(K_t, L) + (1 - \delta)\frac{K_t}{L} \tag{3.13}$$

$$\frac{K_{t+1}}{L} = sF\left(\frac{K_t}{L}, \frac{L}{L}\right) + (1 - \delta)\frac{K_t}{L} \tag{3.14}$$

$$k_{t+1} = sF(k_t, 1) + (1 - \delta)k_t \tag{3.15}$$

This last equation is the key law of motion that describes the movement of capital per worker
over time in the Solow model. The equation says that capital per worker next period equals
investment per worker this period plus existing capital per worker after depreciation. Figure 3.3
describes the basic Solow model. The graph has capital per worker this period, $k_t$, on the horizontal
axis and capital per worker next period, $k_{t+1}$, on the vertical axis. We plot two functions in this
graph. The first is the identity function $k_{t+1} = k_t$. Points along this line represent situations in
which capital per worker does not change over time. The second function we plot is equation 3.15,
which describes how capital per worker evolves from one period to the next in the Solow model.
The intersection of the two curves is a level of capital per worker $k^*$ that does not
change over time in the Solow model. We will refer to this situation as a steady state of the model.
The assumption that the marginal product of capital diminishes as capital increases is
the key reason why the law of motion for capital bends over in Figure 3.3. This implies that
there is at most one steady state value for $k^*$ other than zero. There will be one such value when the
marginal product goes to zero as capital gets sufficiently large.
In order to interpret the steady state, substitute $k_{t+1} = k_t = k^*$ into the law of motion
for capital per worker to obtain $0 = -\delta k^* + sF(k^*, 1)$. Rearranging, we obtain $\delta k^* = sF(k^*, 1)$. This
equation says that, in the steady state of the simple Solow model, investment equals depreciation.
That is, the amount invested every period is just enough to compensate for the deterioration of the
stock of capital. Therefore, capital per worker does not change over time.
An interesting question is whether the economy will move toward the steady state level of
capital over time. For example, suppose the stock of capital is $k_t = k_2$. How will the stock of capital
evolve over time? This question asks about the nature of the dynamics of the basic Solow model. It
turns out that, using Figure 3.3, one can conclude that the economy will move gradually toward $k^*$
from any starting point $k_t > 0$ as $t$ increases. The following theorem explains this.

Theorem 3.3.1 — Dynamics of the simple Solow Model. In the simple Solow model, if
capital per worker is positive, then capital per worker moves gradually towards the steady state
k∗ over time.

Proof. Consider any point kt > 0 on the horizontal axis. If the point is to the left of k∗ then, at
that point, the law of motion of the Solow model lies above the identity line, so one knows that
kt+1 > kt . One can also see that, because the law of motion is increasing, kt+1 < k∗ . This means
that the capital stock per person will be larger next period, but not larger than k∗ . Thus, if capital per
worker is to the left of k∗ , capital per worker will move part of the way toward k∗ by next period.
Now consider any point $k_t > k^*$. At any such point, the law of motion lies below the identity
line. This implies that, according to the law of motion of the basic Solow model, $k_{t+1} < k_t$ so that
capital per worker will be lower next period. Again, since the law of motion is increasing, $k_{t+1} > k^*$.
Thus, if capital per worker is to the right of $k^*$, capital per worker will move part of the way toward
$k^*$ by next period. This completes the proof. ∎
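
The statement of the theorem can also be checked by iterating the law of motion from one starting point on each side of the steady state. The sketch below assumes a Cobb-Douglas production function with illustrative parameter values (s = 0.1, δ = 0.04, A = 1, β = 0.3) and arbitrary starting points; the function names are hypothetical.

```python
# Iterate k_{t+1} = s*A*k_t**beta + (1 - delta)*k_t from below and above k*.
# Parameter values and starting points are illustrative assumptions.

def next_k(k, s=0.1, delta=0.04, A=1.0, beta=0.3):
    """One step of the basic Solow law of motion with Cobb-Douglas production."""
    return s * A * k**beta + (1 - delta) * k

def steady_state(s=0.1, delta=0.04, A=1.0, beta=0.3):
    """Positive solution of delta*k = s*A*k**beta."""
    return (s * A / delta) ** (1 / (1 - beta))

def path(k0, periods):
    """Capital per worker for periods 0, 1, ..., periods."""
    ks = [k0]
    for _ in range(periods):
        ks.append(next_k(ks[-1]))
    return ks

k_star = steady_state()
from_below = path(0.5, 500)    # starts to the left of k*
from_above = path(10.0, 500)   # starts to the right of k*

print(f"k* = {k_star:.4f}")
print("from below:", [round(k, 3) for k in from_below[:4]], "...", round(from_below[-1], 4))
print("from above:", [round(k, 3) for k in from_above[:4]], "...", round(from_above[-1], 4))
```

Both paths should approach the steady state monotonically without ever crossing it, as the proof argues.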

An alternative way to answer this question is to construct a numerical example and simulate the
time path of capital per person. We do so next.
 Example 3.1 — Basic Solow Model: A Numerical Example. To see how the basic Solow
model works at a mechanical level it is useful to consider a numerical example. Specifically, we
will describe a particular production function and particular values of all parameters (e.g. the
depreciation rate, the savings rate and all parameters describing the production function). Once
this is done, we will use the basic equation of the Solow model to compute how values of the
capital-labor ratio and other variables change over time. All calculations can be done with a
standard spreadsheet, with any programming language or by hand calculator.

Table 3.1: Parameterizing the Solow Model

Production Function    $Y = F(K, L) = AK^{\beta}L^{1-\beta}$, $A = 1.0$, $\beta = 0.3$
Depreciation Rate      $\delta = 0.04$
Savings Rate           $s = 0.1$

The key equation of the Solow model which describes dynamics is given in the first equation
below. This equation is written in terms of the capital-labor ratio. To use this equation we have to
express the production function in terms of ratios to the labor input. This can be done for the
Cobb-Douglas function $Y = F(K, L) = AK^{\beta}L^{1-\beta}$ simply by dividing both sides by labor $L$ to get that
$y = Ak^{\beta}$. The second equation below then follows from the first by substituting $y = F(k_t, 1) = Ak_t^{\beta}$
into the first equation. The lowercase symbol $y = Y/L$ is output per worker or the output-labor
ratio.

Table 3.2: Time Paths in the Solow Model

Capital       Output        Investment    Consumption
              y = Ak^β      i = sy        c = (1 − s)y
k0 = 1.0      y0 = 1.0      i0 = .10      c0 = .90
k1 = 1.060    y1 = 1.017    i1 = .101     c1 = .915
k2 = 1.119    y2 = 1.034    i2 = .103     c2 = .930
...           ...           ...           ...
k∞ = 3.70     y∞ = 1.48     i∞ = .148     c∞ = 1.333

$$k_{t+1} = sF(k_t, 1) + (1 - \delta)k_t \tag{3.16}$$

$$k_{t+1} = sAk_t^{\beta} + (1 - \delta)k_t \tag{3.17}$$

Table 3.2 calculates time profiles for several periods for a number of variables in the Solow
model. Table 3.2 is based on the assumption that the initial capital-labor ratio equals 1 (i.e. k0 = 1.0)
and uses the parameters stated in Table 3.1. First, the time profile for the capital-labor ratio is
calculated using equation 3.17. All the other profiles are calculated using the profile for the capital-
labor ratio because output, investment and consumption are simple functions of the capital-labor
ratio. The final steady state quantities are also indicated at the bottom of the table. This numerical
example illustrates that the capital-labor ratio converges to the steady-state capital-labor ratio over
time. 
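
The calculations behind Table 3.2 can be reproduced with a short program instead of a spreadsheet. The following is a minimal sketch using the parameters of Table 3.1 and equation 3.17; the function and variable names are our own choices for illustration.

```python
# Parameters from Table 3.1.
s, delta, A, beta = 0.1, 0.04, 1.0, 0.3

def simulate(k0, periods):
    """Return rows (t, k, y, i, c) for t = 0, ..., periods using equation 3.17."""
    rows = []
    k = k0
    for t in range(periods + 1):
        y = A * k**beta            # output per worker
        i = s * y                  # investment per worker
        c = (1 - s) * y            # consumption per worker
        rows.append((t, k, y, i, c))
        k = i + (1 - delta) * k    # capital per worker next period
    return rows

rows = simulate(k0=1.0, periods=2)
for t, k, y, i, c in rows:
    print(f"t={t}: k={k:.3f}  y={y:.3f}  i={i:.3f}  c={c:.3f}")

# Steady state: delta*k = s*A*k**beta, so k* = (s*A/delta)**(1/(1-beta)).
k_star = (s * A / delta) ** (1 / (1 - beta))
y_star = A * k_star**beta
print(f"steady state: k={k_star:.2f}  y={y_star:.2f}  i={s * y_star:.3f}  c={(1 - s) * y_star:.3f}")
```

Increasing `periods` shows the capital-labor ratio approaching the steady state value from below.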

 Example 3.2 — Basic Solow Model: An Empirical Application. In the basic version of the
Solow model, we have so far assumed that labor input Lt and technology At do not change over
time. Now we will apply the basic model when the population exogenously changes over time.
Figure 3.4 plots data on the population in England from the work of Gregory Clark.14 The
English population was roughly 5 million around 1300 but decreased in several steps to around 2.5
million by 1400. According to Clark, the population was roughly 4.5 million in 1348 but was 3.5
million in 1350. If you Google “The Black Death in England” you will find a Wikipedia page that
states: “The Black Death was a pneumonic plague pandemic, which reached England in June 1348,
and died down by December 1349. It was the first and most severe manifestation of the Second
Pandemic, caused by Yersinia pestis bacteria.” We will view the fall in the population of England
as due to an exogenous shock related to the arrival of the Black Death.
Clark also calculates measures of the real wage for farm workers and for non-farm workers. His
measures for wage rates can be calculated more frequently than his measure of the population of
England. He finds that the real wage for both labor types spikes upwards when the population falls.
The real wage for farm work increases substantially when the population
falls and remains above the pre-1348 farm wage level. We take two main facts away from Figure
3.4. First, the English population falls after 1348. Second, average real wage rates increase in
England after 1348 compared to their previous level and particularly so for farm wages.
The key thing that we want to explain is why wage rates moved in the opposite direction
from the population. We will use the Solow model and the assumption that the
plague is an exogenous shock that kills people ($L$) but not physical capital ($K$) such as buildings,
roads, ships and farmlands. We will view it as a one-time shock for theoretical convenience that
14 See Gregory Clark, "The Macroeconomic Aggregates for England, 1209-1869" Research in Economic History,

2010.

[Figure: "England: Black Death". Farm wage, nonfarm wage and population (millions/10) plotted against time, 1340-1370.]

Figure 3.4: England- Wage and Population

permanently lowers the population of workers in the model and that this shock happens in one
model period.
The theoretical model implies that the capital-labor ratio $k = K/L$ immediately increases due
to the shock. If $k$ was initially at a steady state level $k^*$ before the shock - see Figure 3.3 - then $k$
immediately moves to the right after the shock: the plague killed people but did not destroy capital.
After the shock, the model implies that the capital-labor ratio slowly returns to the previous steady
state level $k^*$ absent future shocks. Of course, this implies that the new total capital stock in steady
state is now lower than before the plague.
What does the model imply for the wage rate? We assume that factors of production (labor and
capital) get paid their marginal products. Thus, the real wage rate w in the model is determined
in theory by the level of the technology and by the capital-labor ratio k. For a Cobb-Douglas
production function - see below - the implication is that the wage rate is higher immediately
after the shock than before the shock. This is simply because a relatively high capital-labor ratio
produces, technology held constant, a relatively high marginal product of labor.

$$w = F_L(K, L) = (1 - \beta)AK^{\beta}L^{-\beta} = (1 - \beta)A\left(\frac{K}{L}\right)^{\beta} = (1 - \beta)Ak^{\beta}$$
The model also implies that the wage rate will slowly decrease over time (absent future shocks)
and will return to the previous steady state level determined by the production function and k∗ . In
summary, the theory predicts that wages increase immediately after the shock and then gradually
decrease back to the steady state value absent other shocks. The model qualitatively predicts the
increase in the wage rate for farm workers that Clark documents.
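
The qualitative prediction can be traced out in a small simulation. This sketch rests on several assumptions that are not in the text: it reuses the parameter values of Table 3.1 (which are not calibrated to medieval England), it treats Clark's population figures of 4.5 and 3.5 million as the labor input before and after the shock, and it starts the economy in steady state.

```python
# Black Death experiment in the basic Solow model.
# Assumptions: Table 3.1 parameters, economy initially in steady state,
# labor input falls once from 4.5 to 3.5 (millions), capital is unaffected.
s, delta, A, beta = 0.1, 0.04, 1.0, 0.3

def wage(k):
    """Marginal product of labor: w = (1 - beta) * A * k**beta."""
    return (1 - beta) * A * k**beta

k_star = (s * A / delta) ** (1 / (1 - beta))
w_star = wage(k_star)

L_before, L_after = 4.5, 3.5
K = k_star * L_before        # capital stock just before the shock
k = K / L_after              # capital per worker jumps up when L falls

wages = []
for t in range(200):
    wages.append(wage(k))
    k = s * A * k**beta + (1 - delta) * k   # law of motion with the new, constant L

print(f"wage before shock: {w_star:.3f}")
print(f"wage right after shock: {wages[0]:.3f}")
print(f"wage 50 periods later: {wages[50]:.3f}")
```

The wage jumps on impact and then drifts back toward its pre-shock level as capital per worker falls, matching the qualitative pattern described above.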


3.3.2 The Full Solow Model


In this version of the Solow model, we assume that both the population and technology grow at
constant rates over time. This version covers the simple version of the last section as a special case.
The question that the model will answer is thus: suppose that, over time, technological progress
is such that the economy can produce more and more with the same inputs. Also, assume that the
number of people increases over time at a fixed rate. How will output per worker and capital per
worker evolve over time if a fixed fraction of output is invested every period?
There are two things to be noted about the way Solow introduced technological change. First,
technological change is called labor augmenting as improvements in the technology act to increase
“effective” labor. The quantity Lt At will be called effective labor. Second, technological change is
disembodied in the sense that all existing workers and capital can be used in the production process
as the technological level, At , increases. Another way to put this is that no worker or machine
becomes obsolete as new technologies arrive. The following 5 equations describe the Solow model.

Ct + It = F(Kt , Lt At ) (3.18)
Kt+1 = It + (1 − δ )Kt (3.19)
It = sF(Kt , Lt At ) (3.20)
Lt+1 = Lt (1 + n) (3.21)
At+1 = At (1 + g) (3.22)
Compared to the equations of the simple Solow growth model, these equations now have
effective labor in the production function and also have two laws of motion for labor and technology.
The growth rates of technology and population are denoted with the symbols g and n, respectively.
In the previous section, it was possible to analyze the steady state and the dynamics of the
simple Solow model by expressing the law of motion in terms of capital per unit of labor. In this
section, our goal will be to find a law of motion for capital per unit of effective labor.
The first step we take is to find the growth rate of effective labor. To do this, we combine the
laws of motion of labor and technology which appear in equations 3.21 and 3.22 above:

Lt+1 At+1 = Lt At (1 + n)(1 + g)


We observe that effective labor grows by the factor $(1 + n)(1 + g)$ each period, so its growth rate,
$(1 + n)(1 + g) - 1$, compounds the growth rates of technology and labor. The next step we take is to
divide both sides of the law of motion for the capital stock by $L_t A_t$. We do so in four steps, below:
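$$\frac{K_{t+1}}{L_t A_t} = \frac{1}{L_t A_t} sF(K_t, L_t A_t) + (1 - \delta)\frac{K_t}{L_t A_t} \tag{3.23}$$

$$\frac{K_{t+1}}{L_t A_t} = sF\left(\frac{K_t}{L_t A_t}, \frac{L_t A_t}{L_t A_t}\right) + (1 - \delta)\frac{K_t}{L_t A_t} \tag{3.24}$$

$$\frac{K_{t+1}}{L_t A_t} = sF(k_t, 1) + (1 - \delta)k_t \tag{3.25}$$

$$k_{t+1}(1 + g)(1 + n) = sF(k_t, 1) + (1 - \delta)k_t \tag{3.26}$$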
In the simple Solow model, three steps were sufficient but now we need an extra step. Let’s
review the steps. In the first step we divide both sides of the law of motion by $L_t A_t$. In the second
step we use the definition of constant returns to scale to re-express the inputs in the production
function. The third is notational as it replaces the ratio $K_t/(L_t A_t)$ with $k_t$, which is now defined as
capital per effective unit of labor. The last step uses the law of motion for effective units of labor to
rewrite the left-hand side in terms of $k_{t+1}$: since $L_{t+1}A_{t+1} = L_t A_t (1 + n)(1 + g)$, we have
$K_{t+1}/(L_t A_t) = k_{t+1}(1 + g)(1 + n)$. We can simplify this expression to obtain the law of motion for
capital per effective unit of labor in the Solow model:

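[Figure: $k_{t+1}$ (vertical axis) against $k_t$ (horizontal axis), showing the identity line $k_{t+1} = k_t$ and the law of motion $k_{t+1} = \frac{(1 - \delta)k_t + sF(k_t, 1)}{(1 + g)(1 + n)}$, which intersect at $k^*$.]

Figure 3.5: Full Solow Model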

Definition 3.3.2 Law of Motion of the Solow Growth Model: The following equation describes
the dynamics of capital per unit of effective labor in the Solow Growth model as a function of
the saving rate s, the depreciation rate δ , population growth rate n, technological growth rate g
and the shape of the production function F:

$$k_{t+1} = \frac{sF(k_t, 1) + (1 - \delta)k_t}{(1 + n)(1 + g)}$$

Figure 3.5 describes the full Solow model. Note that Figure 3.3 and Figure 3.5 look very similar,
but they differ in two important ways. First, the $k_t$ in Figure 3.3 was defined as $K_t/L$ whereas the $k_t$ in
Figure 3.5 is defined as $K_t/(L_t A_t)$. Second, the law of motion is now divided
by $(1 + g)(1 + n)$.
The intersection of the identity function kt+1 = kt and the law of motion determines the steady
state level of capital per unit of effective labor of the economy. Therefore, at k∗ , capital per effective
unit of labor in the economy becomes constant over time.
As in the simple Solow model, the dynamics of the model say that, if capital per unit of effective
labor is positive, it will move gradually towards k∗ over time. Thus, we can formulate a theorem.

Theorem 3.3.2 — Dynamics of the Solow Model. In the Solow model, if capital per unit of
effective labor is positive, then capital per effective unit of labor moves gradually towards the
steady state k∗ over time.

Proof. See the proof of Theorem 3.3.1. ∎

Let’s now analyze the steady state of the economy. In steady state, capital per effective unit of
labor is equal to $k^*$, so that $K_t/(A_t L_t) = k^*$. Therefore

$$K_t = k^* A_t L_t$$

This equation shows that, in steady state, the capital stock is proportional to effective labor. For
this reason, the stock of capital $K_t$ grows at rate $(1 + g)(1 + n) - 1$ in the steady state. This is the
growth rate of effective labor which is exogenous to the model. Output is given by the production
function $Y_t = F(K_t, L_t A_t)$. If capital is growing according to the steady state equation then we have
$Y_t = F(k^* A_t L_t, A_t L_t)$. By constant returns to scale, we have that

$$Y_t = A_t L_t F(k^*, 1).$$

So output also grows at rate $(1 + g)(1 + n) - 1$. Immediately, we realize that investment $I_t = sY_t$
also grows at rate $(1 + g)(1 + n) - 1$ and consumption, $C_t = (1 - s)Y_t$, also grows at this rate. This
follows because, by assumption, investment and consumption are fixed fractions of output dictated
by $s$ and $1 - s$, respectively. Thus, we have determined that all these aggregate variables grow at
constant rates in a steady state of the Solow model.
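
These steady state growth rates can be verified by simulating equations 3.18 to 3.22 directly. The sketch below assumes a Cobb-Douglas production function $F(K, LA) = K^{\beta}(LA)^{1-\beta}$ with s, δ and β taken from Table 3.1; the growth rates n = 0.01 and g = 0.02 and the initial values of L and A are hypothetical.

```python
# Full Solow model on its steady state path.
# s, delta, beta follow Table 3.1; n, g and the initial L, A are hypothetical.
s, delta, beta = 0.1, 0.04, 0.3
n, g = 0.01, 0.02

def F(K, N):
    """Cobb-Douglas output with effective labor N = L * A."""
    return K**beta * N**(1 - beta)

# Steady state of k_{t+1}(1+n)(1+g) = s*k**beta + (1-delta)*k.
k_star = (s / ((1 + n) * (1 + g) - (1 - delta))) ** (1 / (1 - beta))

L, A = 1.0, 1.0
K = k_star * L * A            # start on the steady state path
path = []
for t in range(5):
    Y = F(K, L * A)
    path.append((K, Y, L))
    K = s * Y + (1 - delta) * K   # equations 3.19 and 3.20
    L *= 1 + n                    # equation 3.21
    A *= 1 + g                    # equation 3.22

growth_K = path[1][0] / path[0][0] - 1
growth_Y = path[1][1] / path[0][1] - 1
growth_Y_per_worker = (path[1][1] / path[1][2]) / (path[0][1] / path[0][2]) - 1

print(f"capital growth: {growth_K:.4f}, output growth: {growth_Y:.4f}")
print(f"output per worker growth: {growth_Y_per_worker:.4f}")
```

Capital and output should both grow at $(1 + g)(1 + n) - 1$, while output per worker grows at g.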

3.4 Is the Solow Model Consistent with Kaldor’s Facts?


We state Kaldor’s facts and then relate them to the theoretical results obtained in the previous
section. The facts are:
1. Output per capita grows over time.
2. Capital per capita grows over time.
3. The capital-output ratio is approximately constant over time.
4. Capital and labor’s share of output is approximately constant over time.
5. The return to capital does not have a strong trend.
6. Levels of output per capita vary widely across countries.
In a steady state of the Solow model, that is, if (i) there is an aggregate production function
with constant returns to scale and diminishing marginal products, (ii) Lt and At grow at constant
rates, and (iii) investment in capital is a fixed share of ouput s, then:
1. Yt grows at rate (1 + g)(1 + n) − 1 and output per worker grows at rate g, a constant rate.
2. Kt grows at rate (1 + g)(1 + n) − 1 and capital per worker grows at rate g.
3. In steady state the capital-output ratio Kt /Yt is constant, as capital and output grow at the
same rate by conclusions 1 and 2.
4. If the production function is Cobb-Douglas, then the capital and labor shares of output are
constant and equal to β and 1 − β , respectively.
5. The return to capital is also called the real interest rate r or the net payoff from investment.
We argue that the real interest rate is constant in a steady state of the Solow model. First,
if an extra unit of capital is invested at time t − 1 then the total payoff at time t from this
investment is the marginal product of capital (the rental price of capital) plus (1 − δ ) which is
the remaining unit of capital after depreciation. Thus, the net payoff is r = FK (Kt , At Lt ) − δ .
Second, if the production function is Cobb-Douglas, the marginal product of capital is

$$F_K(K_t, A_t L_t) = \beta \left( \frac{A_t L_t}{K_t} \right)^{1-\beta} = \beta (k^*)^{\beta - 1},$$

which is a constant in steady state. Since the depreciation rate is constant then so is the return
to capital r in steady state.
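As a rough numerical illustration of conclusion 5, the sketch below evaluates r = FK (Kt , At Lt ) − δ at several dates along a steady state path. The Cobb-Douglas form follows the text; the parameter values and the normalization A0 L0 = 1 are assumptions made only for this example.

```python
beta, s, delta, g, n = 0.3, 0.2, 0.05, 0.02, 0.01   # illustrative assumptions

k_star = (s / ((1 + g) * (1 + n) - (1 - delta))) ** (1 / (1 - beta))

def marginal_product_of_capital(K, AL, beta):
    """F_K for F(K, AL) = K**beta * AL**(1 - beta)."""
    return beta * (AL / K) ** (1 - beta)

returns = []
for t in range(0, 50, 10):
    AL = ((1 + g) * (1 + n)) ** t      # effective labor A_t * L_t, with A_0 * L_0 = 1
    K = k_star * AL                    # steady state capital stock at date t
    returns.append(marginal_product_of_capital(K, AL, beta) - delta)

print([round(r, 4) for r in returns])
```

Capital and effective labor both grow over time, but their ratio does not, so the computed return is the same at every date.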
Note that we provided a quick and clean answer in 4 and 5 by specializing the production
function to Cobb-Douglas and using results derived previously in this chapter. It turns out that
the conclusions for properties 4 and 5 hold, in steady state, regardless of whether the production
function is Cobb-Douglas. We do not present the argument when the production function is not
Cobb-Douglas as it is somewhat technical.
The only remaining fact that the Solow model has yet to explain is Kaldor’s sixth fact. In the
Solow model, different countries can have different steady state levels of output per effective unit of
labor and different growth rates if they have (i) different saving rates s, (ii) different population
growth rates n or (iii) different growth rates of technology g. In the next section we examine how
far measured differences in saving rates can go in explaining cross country differences in output,
assuming the same technology and the same (g, n) values for all countries.
The situation in which two countries differ in saving rates can be illustrated in Figure 3.3 by
shifting the law of motion up or down. Specifically, a higher saving rate implies a higher kt+1
at every level of kt , thus shifting the law of motion upwards. A similar analysis can be conducted
when the population growth rate changes: the law of motion shifts and a different steady state level
of capital per unit of effective labor results. The figure is useful in ranking steady states.
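The shifts described above can be sketched in a few lines. We assume the Cobb-Douglas case, in which the law of motion for capital per effective unit of labor is kt+1 = [s kt^β + (1 − δ)kt ]/[(1 + g)(1 + n)]. The function names and all parameter values are hypothetical choices for illustration.

```python
def next_k(k, s, n=0.01, beta=0.3, delta=0.05, g=0.02):
    """Law of motion for capital per effective unit of labor (Cobb-Douglas case)."""
    return (s * k ** beta + (1 - delta) * k) / ((1 + g) * (1 + n))

def steady_state(s, n=0.01, beta=0.3, delta=0.05, g=0.02):
    """Closed-form fixed point of next_k."""
    return (s / ((1 + g) * (1 + n) - (1 - delta))) ** (1 / (1 - beta))

grid = [0.5, 1.0, 2.0, 4.0]
low_saving = [next_k(k, s=0.10) for k in grid]
high_saving = [next_k(k, s=0.20) for k in grid]
higher_pop_growth = [next_k(k, s=0.20, n=0.03) for k in grid]

print("k* with s = 0.10:", round(steady_state(0.10), 3))
print("k* with s = 0.20:", round(steady_state(0.20), 3))
print("k* with s = 0.20, n = 0.03:", round(steady_state(0.20, n=0.03), 3))
```

A higher saving rate raises the law of motion at every grid point and raises k∗, while faster population growth lowers both.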

3.5 Cross Country Comparisons

Up to this point, we have focused mostly on developing the logic of how the Solow model works.
An interesting issue is whether or not the model broadly makes sense of the data. The previous
section argued that the steady state of the model is consistent with Kaldor’s growth facts 1-5. What
is not clear is the extent to which the model is consistent with Kaldor’s sixth fact: Y /L differs widely
across countries at a point in time. The US has a level of GDP per capita that is approximately 30
times that of some very poor countries.15
While a careful quantitative analysis of the degree to which the Solow model is in agreement
with such facts is beyond the scope of this book, it may be useful to lay out some facts and some
opinions about the state of the literature. First, cross-country data does show that countries with
high measured Y /L also typically have a high measured K/L. This is good news for a theory that
requires that Y /L = F(K/L, A) and that maintains as a provisional assumption that A is common
across countries. Second, countries with high measured Y /L typically have high measured
investment rates I/Y over long time periods. This also seems to be good news as within the Solow
model a high investment rate (i.e. a high s) in steady state is the means of attaining high K/L and
high Y /L, given the assumption that technology is common across countries.
Now one can approach this second fact from a different angle, to see if this is really good news
for the Solow model. One could ask first what are the savings or investment rates at the high and
low end of the distribution. One can find very low investment countries, like Egypt, Chad or Uganda
averaging s2 = .05 or less over several decades and very high investment countries, like China or
Singapore, averaging s1 = .30 for a few decades. One could then ask whether such differences lead
in theory to big steady state differences in Y /L or Y /LA, holding technology A, depreciation δ ,
population growth n and technological growth g constant across countries.
How much does steady state output per worker change as s changes, other things equal? To
answer this question, we first find k∗ , which is determined by the intersection of the identity line and
the law of motion in Figure 3.2. Consider the Cobb-Douglas production function y = F(k, 1) = kβ .
Plugging this function in the law of motion for capital per effective unit of labor in equation 3.26,
and imposing kt+1 = kt = k∗ we obtain:

k∗ (1 + g)(1 + n) = s(k∗ )β + (1 − δ )k∗

15 Facts relating to differences in GDP per capita across countries at a point in time and across time periods are presented

in Parente and Prescott’s paper entitled “Changes in the Wealth of Nations”, Federal Reserve Bank of Minneapolis
Quarterly Review, 1994. In that work differences in GDP per capita across countries at a point in time are measured
using a common set of world prices.
We can solve for k∗ in 3 algebra steps:

$$k^* \left[ (1+g)(1+n) - (1-\delta) \right] = s (k^*)^{\beta} \qquad (3.27)$$

$$(k^*)^{1-\beta} = \frac{s}{(1+g)(1+n) - (1-\delta)} \qquad (3.28)$$

$$k^* = \zeta s^{\frac{1}{1-\beta}} \qquad (3.29)$$

where $\zeta = \left[ \frac{1}{(1+g)(1+n) - (1-\delta)} \right]^{1/(1-\beta)}$ is a constant depending on the values of n, g,
δ and β . With the expression for k∗ as a function of the saving rate in hand, we can find output
per effective unit of labor as a function of the saving rate. This is done by plugging k∗ into the
production function, which gives
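$$y^* = (k^*)^{\beta} = \zeta^{\beta} s^{\frac{\beta}{1-\beta}}$$
According to this expression, the ratio of steady-state output per effective worker for two countries
with saving rates s1 and s2 is given by
$$\frac{y_1^*}{y_2^*} = \left( \frac{s_1}{s_2} \right)^{\frac{\beta}{1-\beta}} = \left( \frac{0.30}{0.05} \right)^{\frac{\beta}{1-\beta}}.$$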

Note that this ratio will not depend on ζ , only on β and the saving rates. It would depend on ζ
if n or g were different across countries. If we take a stand on the value of the parameter β and set β = .30,
which is a ball-park number for capital’s share in the U.S., then the ratio is $y_1^*/y_2^* = 6^{.3/.7} \approx 2.15$.
Recall that this is the ratio of output per effective unit of labor across countries in steady state
of the Solow model. If technology is equal across countries, then we interpret 2.15 as the ratio of
output per worker across countries that would hold in steady state when only the savings rates differ
among the exogenous inputs to the Solow model. This is a tiny ratio compared to the factor-of-30
difference observed in cross-country data. Even if β = .5, the ratio is only 6. The upshot is that
steady-state differences in output implied by the Solow model and measured differences in saving
rates alone are quite small compared to output differences measured in data. Thus, we conclude
that something other than savings rates in physical capital must be very important in accounting for
observed cross-country differences.
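The arithmetic behind these ratios is easy to reproduce. The sketch below uses the saving rates .30 and .05 from the text; the helper beta_needed is a hypothetical addition that asks how large capital's share would have to be for saving-rate differences alone to generate a factor of 30.

```python
import math

def steady_state_output_ratio(s1, s2, beta):
    """Ratio y1*/y2* when two countries differ only in their saving rates."""
    return (s1 / s2) ** (beta / (1 - beta))

def beta_needed(s1, s2, target_ratio):
    """Capital share at which the saving-rate gap alone would produce target_ratio."""
    x = math.log(target_ratio) / math.log(s1 / s2)   # x = beta / (1 - beta)
    return x / (1 + x)

for beta in (0.30, 0.50):
    print(beta, round(steady_state_output_ratio(0.30, 0.05, beta), 3))
print("beta needed for a factor of 30:", round(beta_needed(0.30, 0.05, 30), 3))
```

With β = .30 the ratio is about 2.155, and with β = .5 it is 6. A capital share of roughly .65 would be needed to reach a factor of 30, well above the ball-park value of .30 used in the text.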
One maintained assumption in applying the Solow model to interpret cross-country data, for
example the analysis of savings rate differences above, is that the technology level is the same across
countries. This assumption seems likely to be incorrect. We now indicate how in principle one
might go about trying to indirectly infer technological differences across countries. The framework
described below allows country i’s GDP per worker denoted Yi to be determined by the technology
Ai and the per worker input of capital Ki and the per worker quality adjusted labor Li in country i.
It is typical in this literature to use a Cobb-Douglas production function and an empirical estimate
of capital’s share β . The basic idea is then to measure (Yi , Ki , Li ) in a cross section of countries at a
point in time and then to back out technology Ai for each country i.

$$Y_i = F(K_i, L_i, A_i) = K_i^{\beta} (L_i A_i)^{1-\beta} \qquad (3.30)$$

$$A_i = \left[ \frac{Y_i}{K_i^{\beta} L_i^{1-\beta}} \right]^{1/(1-\beta)} \qquad (3.31)$$
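A minimal sketch of the backing-out step in equation 3.31 follows. The capital share β = 0.3 and the two sets of per-worker observations are made up purely for illustration; they are not data from Caselli (2005) or any other source.

```python
def implied_technology(Y, K, L, beta):
    """Invert Y = K**beta * (L*A)**(1 - beta) for A (equation 3.31)."""
    return (Y / (K ** beta * L ** (1 - beta))) ** (1 / (1 - beta))

beta = 0.3   # assumed capital share

# Hypothetical per-worker observations (Y, K, L), made up for illustration only.
countries = {
    "rich": (30.0, 90.0, 2.0),
    "poor": (1.0, 1.5, 1.0),
}
technology = {name: implied_technology(Y, K, L, beta) for name, (Y, K, L) in countries.items()}
relative = technology["rich"] / technology["poor"]
print(technology, relative)
```

Whatever part of the output gap is not matched by the measured gaps in K and L is attributed to A, which is why the measurement of labor quality matters so much for the inferred technology differences.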

The literature which carries out this type of calculation is surveyed by Caselli (2005).16 A key
issue is then to have a measure of worker quality. In practice economists use data on the distribution
16 See Francesco Caselli (2005), Accounting for Cross-Country Income Differences, Handbook of Economic Growth,

Chapter 9.
of the workforce by experience (years worked) and by years of schooling. The idea is that in
cross-section data earnings increase with both experience and schooling and thus workers with
high experience and schooling are more productive and, hence, are of higher quality. To the degree
that rich countries have a distribution of workers with higher experience and higher schooling than
poor countries, these differences provide empirical support for rich countries
having larger quality adjusted labor input Li per worker and, thus, higher output per worker.
A typical finding from this literature (see Caselli (2005)) is that rich countries (i.e. countries
with high Y ) have relatively high capital per worker K and labor quality per worker L. Thus,
variation in measured factor inputs (K, L) accounts for some of the output per worker differences
across countries. However, these measured differences do not go far enough to explain all of the output
per worker Y variation across countries. Rich countries are inferred to have a higher technology
level A than poor countries based on equation 3.31 and cross-country data. Technology differences
turn out to be a quantitatively very important source of GDP differences.
Some recent work by Lagakos, Moll, Porzio and Qian (2012) argues that better measurement
of labor quality differences across countries substantially reduces the importance of technology
differences.17 They argue that differences in capital and labor quality explain approximately
two-thirds of the measured ratio of GDP per capita of the country at the 90th percentile of the
distribution compared to GDP per capita of the country at the 10th percentile. If this result proves
to be widely supported in the data, then the key question in the literature is what accounts for
such measured differences in labor quality across countries. Of course, the Solow growth model is
silent on the sources of these differences as it is not a theory of worker quality differences. The
dominant body of work that provides theory for quality differences is the literature on human capital
accumulation.18

3.6 Growth Accounting

Growth accounting is a tool for dividing up output growth into distinct sources. This tool can
be used to answer two types of questions. The first type asks what portion of observed output
growth in a country (or even a firm) over some period of time can be accounted for by changes in
technology versus the portion that can be accounted for by changes in factor inputs. The second
type of question asks what would be the effect on output growth of a change in the technology or a
change in some specific factor input, other things equal.
In questions of the first type, growth accounting tells one where growth comes from. However,
it does not tell one why the economy functions in this way. Here, the analogy with financial
accounting is apt. An accountant may be able to tell you where the income of a firm or government
comes from based on the data, but an accountant may not have any theory explaining why it is
the case that income comes from these distinct sources. To answer the latter question one needs a
theory and not merely an accounting framework.

3.6.1 Growth Accounting: Theory

We will now lay out the theory behind growth accounting. Solow assumed that there is an aggregate
production function Yt = At F(Kt , Lt ). Thus, aggregate output Yt is produced when the technology
level equals At and the factor inputs of capital and labor are Kt and Lt , respectively.

17 Lagakos, Moll, Porzio and Qian (2012), Experience Matters: Human Capital and Development Accounting.
18 Gary Becker received the Nobel Prize in 1992 in part for his work on human capital.
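$$Y_t = A_t F(K_t, L_t) \qquad (3.32)$$

$$\dot{Y}_t = \dot{A}_t F(K_t, L_t) + A_t F_K(K_t, L_t) \dot{K}_t + A_t F_L(K_t, L_t) \dot{L}_t \qquad (3.33)$$

$$\frac{\dot{Y}_t}{Y_t} = \frac{\dot{A}_t F(K_t, L_t)}{Y_t} + \frac{A_t F_K(K_t, L_t) \dot{K}_t}{Y_t} + \frac{A_t F_L(K_t, L_t) \dot{L}_t}{Y_t} \qquad (3.34)$$

$$\frac{\dot{Y}_t}{Y_t} = \frac{\dot{A}_t}{A_t} + \left( \frac{A_t F_K(K_t, L_t) K_t}{Y_t} \right) \frac{\dot{K}_t}{K_t} + \left( \frac{A_t F_L(K_t, L_t) L_t}{Y_t} \right) \frac{\dot{L}_t}{L_t} \qquad (3.35)$$

$$\frac{\dot{Y}_t}{Y_t} = \frac{\dot{A}_t}{A_t} + \beta \frac{\dot{K}_t}{K_t} + (1 - \beta) \frac{\dot{L}_t}{L_t} \qquad (3.36)$$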

Based on this assumption we now derive the Solow growth accounting formula. Equation
3.33 differentiates the production function with respect to time and denotes a time derivative of
a variable with a dot above the variable. Thus, Ẏt denotes the time derivative of output which
would be written in familiar calculus notation as follows: Ẏt = dYt /dt. Equation 3.33 says that the time
derivative of output Ẏt equals the effect of technological change plus the effect on output from the
change in capital plus the effect on output from the change in labor. Put slightly differently, output
changes only because technology changes or because factor inputs change over time.
It just remains to reorganize this expression in a useful way. Equation 3.34 divides by output so
that the left hand side of equation 3.34 is the output growth rate. Equation 3.35 reorganizes equation
3.34 so that the quantities in parentheses turn out to have natural interpretations in terms of data.
The quantity At FK (Kt , Lt )Kt /Yt is interpreted as capital’s share of output and is relabeled β in equation
3.36 whereas the quantity At FL (Kt , Lt )Lt /Yt is interpreted as labor’s share of output and is relabeled
(1 − β ) in equation 3.36. Notice that the numerator term in each expression is the marginal product
of capital or labor times the amount of capital or labor. In competitive theory this amount equals
the total payment to capital and labor respectively.
Equation 3.36 is the Solow growth accounting equation. It says that output growth equals
technology growth plus the effect on output coming from the growth of capital and labor, respectively.
Capital growth is weighted by capital’s share of output β = At FK (Kt , Lt )Kt /Yt and labor growth is
weighted by labor’s share of output 1 − β = At FL (Kt , Lt )Lt /Yt . In competitive theory, capital and labor’s
shares sum to one with constant returns.
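Definition 3.6.1 The Solow Growth Accounting Equation is stated below in two different ways:

$$\frac{\dot{Y}_t}{Y_t} = \frac{\dot{A}_t}{A_t} + \beta \frac{\dot{K}_t}{K_t} + (1-\beta) \frac{\dot{L}_t}{L_t} \quad \text{and} \quad \frac{\Delta Y_t}{Y_t} = \frac{\Delta A_t}{A_t} + \beta \frac{\Delta K_t}{K_t} + (1-\beta) \frac{\Delta L_t}{L_t}$$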

When working with data it is convenient to approximate the instantaneous growth rates in the
theory with growth rates calculated using measured variables in successive time periods. Thus, for
example, the instantaneous growth rate of output Ẏt /Yt is approximated as ∆Yt /Yt = (Yt+1 −Yt )/Yt .
It is conventional to use the capital delta symbol ∆ to indicate a change in a variable. We follow the
same convention for factor inputs by replacing instantaneous growth rates with those calculated
using data in neighboring time periods.
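To see how well the discrete approximation works, the sketch below builds two periods of synthetic data from Yt = At Kt^β Lt^(1−β) with a known technology growth rate and then backs technology growth out of the discrete accounting equation. The function names and all numbers are hypothetical, chosen only to illustrate the calculation.

```python
def growth_rate(x_now, x_next):
    """Discrete growth rate (x_{t+1} - x_t) / x_t."""
    return (x_next - x_now) / x_now

def technology_growth(Y, K, L, beta):
    """Solow residual from two consecutive observations of (Y, K, L).

    Y, K and L are pairs (value at t, value at t+1); beta is capital's share.
    """
    return (growth_rate(*Y)
            - beta * growth_rate(*K)
            - (1 - beta) * growth_rate(*L))

# Synthetic check: build data from Y = A * K**beta * L**(1 - beta) with known A growth.
beta, true_gA = 0.3, 0.01
K, L, A = (100.0, 104.0), (50.0, 50.75), (1.0, 1.0 + true_gA)
Y = tuple(a * k ** beta * l ** (1 - beta) for a, k, l in zip(A, K, L))
residual = technology_growth(Y, K, L, beta)
print(f"true technology growth {true_gA:.4f}, measured residual {residual:.4f}")
```

The residual is close to, but not exactly equal to, the true growth rate because the accounting equation is exact only for instantaneous growth rates.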

3.6.2 US Economy: 1909-49 and 1949-2016


Solow used the Solow growth accounting equation to divide US output growth over the period 1909-49
into components. He could measure the output growth rate across each year of his sample as
well as the growth rate of the two inputs labor and capital. He could also measure capital and
labor’s share of output (βt , 1 − βt ) in each year. Thus, he could construct empirical measures of all
the inputs to the growth rate formula except the growth rate of the technology.
Solow measured the technology growth rate indirectly by backing it out of the formula in each
year. Thus, the technology growth rate was calculated as a residual (i.e. whatever was needed to
make both sides of the equation hold with equality period by period). Thus, if the theory is correct,
then the measured technological growth rate will equal the true growth rate plus a term capturing
measurement errors in the measured variables.
The main findings of Solow’s analysis are listed below and follow directly from the growth
accounting equation and US data from 1909 to 1949.
1. Output per unit of labor input grows by about 100 percent in 1909-1949.
2. The capital-labor ratio grows by about 30 percent in 1909-1949. This is sometimes called
“capital deepening”.
3. Technology grows by about 80 percent over the period. Thus, about 80 percent of the
growth in output per unit of labor input over the period is accounted for by growth in the
technology and the remaining 20 percent by increases in the capital-labor ratio. The overall growth
in the technology over the period was computed as follows using the measured growth rates:
At+1 = At (1 + ∆At /At ) and setting A1909 = 1.
4. The measure of the technology level falls in a number of recession and depression years and
tends to increase in expansions over the time period 1909-49. Thus, measured technology
growth rates are “procyclical”.
We will apply the growth accounting equation to US data from the Bureau of Labor Statistics
(BLS) over the period 1949-2016. The measures of real output and inputs are constructed by the
BLS for the private, non-farm business sector. The aggregate measure of labor input Lt is based on
hours of work in the private, non-farm business sector weighted by relative compensation.19 The
measure of capital input Kt is a combination of the various types of capital (e.g. land, structures,
equipment, ...). The measures of capital’s share βt and labor’s share (1 − βt ) of income used in the calculation
vary by year based on labor and capital costs.
Figure 3.6 plots annual growth rates in BLS data. Output grows at an average rate of 3.4 percent
over the period whereas labor and capital input grow at average rates of 1.5 and 3.9 percent per
year. Thus, the capital-labor ratio grows over this time period. The technology growth rate ∆At /At is
backed out of the growth accounting equation each year based on the measured growth rates of
output and inputs as indicated below. The technology growth rate averages 1.1 percent per year
over the period.

$$\frac{\Delta A_t}{A_t} = \frac{\Delta Y_t}{Y_t} - \beta_t \frac{\Delta K_t}{K_t} - (1-\beta_t) \frac{\Delta L_t}{L_t}$$
The growth rates of labor and technology typically move with the output growth rate. Specifically,
times of high output growth are typically times of relatively high labor and technology growth.
A standard measure of the degree to which two series move in the same or opposite directions is
the correlation coefficient. The correlation between the growth of output and labor is 0.83 and
the correlation between the growth of output and technology is 0.82 over the period. Thus, the
technology growth rate is procyclical over the period 1949-2016. Solow found that technology was
procyclical over the period 1909-1949. The finding that technology growth is procyclical will be
important when we later consider theories of business-cycle fluctuations.
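For readers who have not computed a correlation coefficient before, here is a minimal sketch. The two short series are made-up numbers for illustration only; they are not the BLS data behind the correlations reported above.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up annual growth rates, for illustration only (not BLS data).
output_growth = [0.05, 0.03, -0.01, 0.04, 0.02, -0.02]
technology_growth_rates = [0.02, 0.01, -0.01, 0.02, 0.01, -0.005]
rho = correlation(output_growth, technology_growth_rates)
print(round(rho, 2))
```

A value near 1 means the two series tend to be above or below their averages at the same time, which is what "procyclical" means for technology growth.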
We determine the implications of the technology growth rates for the technology level using the
equation At+1 = At (1 + ∆At /At ) and by normalizing the initial technology level to A1948 = 1.0. The
measure of the year-by-year growth rate of technology implies that the technology level more than
doubles over the period.
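The cumulation step can be sketched as follows. Because the year-by-year BLS growth rates are not reproduced in the text, the example uses a stand-in series that grows at the reported 1.1 percent average in each of the 68 years from 1949 to 2016. This is an assumption that only approximates what compounding the actual annual rates would give.

```python
def technology_levels(growth_rates, initial_level=1.0):
    """Cumulate growth rates into levels via A_{t+1} = A_t * (1 + growth_t)."""
    levels = [initial_level]
    for rate in growth_rates:
        levels.append(levels[-1] * (1 + rate))
    return levels

# Stand-in series: 68 years (1949-2016) at the 1.1 percent average reported in the text.
# The actual year-by-year BLS rates are not reproduced here.
levels = technology_levels([0.011] * 68)
print(f"A_2016 / A_1948 = {levels[-1]:.2f}")
```

Even with this rough stand-in, the implied level ends slightly above 2, in line with the statement that technology more than doubles.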
While there are perhaps many questions that one could ask about the results in Figure 3.6, we
focus on two questions.
19 Thus, the hours of highly paid labor (e.g. doctors) are assigned a greater weight than the hours of lower paid labor.
Figure 3.6: Growth Rates and the Technology Level
[Figure omitted. Top panel, “Growth Rates: Output, Labor and Technology”: annual growth rates of output, labor and technology. Bottom panel, “Technology Level”: the technology level over time. The horizontal axis of both panels runs from 1940 to 2020.]


Question 1: How can (indirectly measured) technology fall?


Answer to Question 1:
The finding that the measured technology level falls in some years calls for an explanation,
given the presumption that technology advances with time and that superior technologies are not
forgotten. We will put forward three potential explanations.
Explanation 1: (Measurement Error)
Suppose there is measurement error in labor and capital input so that the inputs to the Solow
accounting equation are composed of the true input growth rate plus some random measurement
error. One possible way of producing a fall in measured technology in a recession is then to have a
particular pattern of measurement errors. For example, imagine that the measured capital growth
rate is overestimated in recessions and underestimated in expansions. This would occur if capital
utilization were particularly high in booms and low in recessions but this variation was not reflected
in measured capital levels. Such errors would then lead us to overestimate technology growth rates in
booms and underestimate technology growth rates in recessions simply because we subtract the
measured growth rate of the capital input in calculating technology growth rates. If plausible, then
some of the procyclicality of measured technology growth could be due to a particular cyclical
pattern of measurement errors.
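A small numerical sketch makes the mechanism concrete. We assume that true technology growth is a constant 1 percent, that output is generated by the growth accounting equation using true capital services, and that the measured capital stock misses swings in utilization. All numbers are hypothetical.

```python
beta = 0.3            # assumed capital share
true_tech_growth = 0.01

def output_growth(capital_services_growth, labor_growth):
    """Output growth implied by the growth accounting equation with true inputs."""
    return true_tech_growth + beta * capital_services_growth + (1 - beta) * labor_growth

def measured_residual(y_growth, measured_capital_growth, labor_growth):
    """Technology growth backed out using the measured capital stock."""
    return y_growth - beta * measured_capital_growth - (1 - beta) * labor_growth

# Hypothetical numbers: true capital services move with utilization, the measured stock does not.
scenarios = {
    #             true capital services, measured capital, labor
    "boom":      (0.08, 0.04, 0.03),
    "recession": (-0.04, 0.02, -0.02),
}
residuals = {}
for name, (true_k, measured_k, labor) in scenarios.items():
    residuals[name] = measured_residual(output_growth(true_k, labor), measured_k, labor)
    print(f"{name}: measured technology growth {residuals[name]:+.3f}")
```

In this example measured technology growth is 2.2 percent in the boom and −0.8 percent in the recession, even though true technology grows by 1 percent in both. The gap equals β times the error in measured capital growth.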
Explanation 2: (Technology Adoption and Learning)
Suppose that new technologies come along and that old technologies are never forgotten. When
a firm adopts a new technology it could plausibly be the case that not all of the expertise the firm had
with the old technology transfers to the new technology and that after the firm switches technology
there is a period of learning about the new technology whereby expertise is gradually accumulated
in this technology. An implication is that output and measured firm productivity could fall after a
switch if the loss in expertise is sufficiently large. Moreover, in all periods after the switch measured
firm technology should increase due to the effects of learning in the new technology. Effectively,
the measured worker hours are of higher quality over time due to learning.
At a more aggregate level, such as at the industry or economy-wide level, the measured aggregate
technology could fall when there is some synchronization in the timing of technology adoption.
A version of this explanation has been advanced to explain the slower aggregate technology growth
rates inferred from US data in the decades immediately after 1974.20 The hypothesis was that
computer technology was the new technology being adopted in the 1980’s and 1990’s.
Explanation 3: (Reallocation of Factor Inputs)
At a point in time firms within an industry differ both in age and in productivity. A typical
finding is that more productive firms of a given age have a higher rate of survival. There are
then at least two possible sources of aggregate productivity growth within an industry. First, the
productivity of newly entering firms improves over time and the average productivity among the
surviving firms tends to increase over time. Second, more productive firms of a given age are more
likely to survive and on balance more capital and labor is allocated over time to these surviving,
high-productivity firms.
This is a much richer view of the process governing aggregate productivity than the simplest
microeconomic theory (of identical firms with constant returns technologies) that is one theoretical
foundation for the use of an aggregate production function. Given this richer view, anything that
changes the process of either the survival of firms or the reallocation of capital and labor inputs to
more productive firms will impact the output produced from given aggregate quantities of labor and
capital. Thus, inferred aggregate productivity will vary for these reasons.
Changes in government policy (governing the firing of workers, governing competition, governing
protection against foreign producers and so on) may be important sources of productivity
variation over long time periods. Changes in the ability of firms to borrow may be important for
productivity variation over shorter time periods. For example, a tightening of firms’ borrowing
limits or an increase in borrowing costs in recessions could slow down the process of allocating
more capital and labor to the most productive firms leading to a productivity growth slowdown.21
20 See Greenwood and Yorukoglu (1997), "1974," Carnegie-Rochester Conference Series on Public Policy, Elsevier,
vol. 46(1), pages 49-95.
21 See Khan and Thomas (2011), Credit Shocks and Aggregate Fluctuations in an Economy with Production Heterogeneity,
NBER Working Paper 17311, for work highlighting implications of a change in firms’ ability to borrow. See
Foster, Haltiwanger and Syverson (2008), Reallocation, Firm Turnover and Efficiency: Selection on Productivity or
Profitability?, American Economic Review, 98, 394-425 for empirical work that relates aggregate productivity changes
to firm productivity changes and firm entry and exit.
Question 2: Is there a deeper sense in which, according to some theory, all output growth can
be attributed to technology growth even though the accounting framework and US 1909-1949 data
say clearly that there is an 80/20 split?
Answer to Question 2:
The Solow growth model is an example of a theory that predicts that there is no long-run
growth in output per unit of labor input or in the capital-labor ratio unless there is technological
progress. This model is consistent with the main findings of applying Solow growth accounting to
US data. Specifically, the growth model predicts that, when technology growth is positive, (i)
the capital-labor ratio grows and (ii) the output-labor ratio grows. Thus, growth accounting
performed on data generated from the Solow growth model would calculate that part of output
growth is due to technology and another part is due to increases in the capital-labor ratio. However,
it is key to point out that the only reason why the capital-labor ratio grows in this theory is because
technology grows! Thus, technology growth induces growth in the capital-labor ratio within this
theory.
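This point can be illustrated by running growth accounting on data generated by the Solow model itself. The sketch assumes a Cobb-Douglas production function Y = K^β (AL)^(1−β), a steady state path, illustrative parameter values, and log growth rates in place of the ∆Y /Y approximation. In per-worker terms the accounting equation reads: growth of Y /L equals technology growth plus β times growth of K/L.

```python
import math

beta, g, n = 0.3, 0.02, 0.01      # illustrative assumptions
k_star = 1.0                       # any positive constant; only growth rates matter below

def steady_state_levels(t):
    """(Y, K, L) at date t on a Solow steady state path with A_0 = L_0 = 1."""
    A, L = (1 + g) ** t, (1 + n) ** t
    K = k_star * A * L
    Y = K ** beta * (A * L) ** (1 - beta)
    return Y, K, L

(Y0, K0, L0), (Y1, K1, L1) = steady_state_levels(0), steady_state_levels(1)

# Growth accounting in per-worker terms, using log growth rates.
output_per_worker_growth = math.log(Y1 / L1) - math.log(Y0 / L0)
capital_deepening = beta * (math.log(K1 / L1) - math.log(K0 / L0))
accounting_technology = output_per_worker_growth - capital_deepening

share_technology = accounting_technology / output_per_worker_growth
share_capital = capital_deepening / output_per_worker_growth
print(f"technology {share_technology:.2f}, capital deepening {share_capital:.2f}")
```

The accounting exercise credits a fraction 1 − β of growth in output per worker to technology and a fraction β to capital deepening. Yet in the model that generated the data, both Y /L and K/L would stop growing if g were zero.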

3.6.3 The Asian Growth “Miracle”


Economists and the general public have focused attention on the growth experience of the Asian
Tigers: Taiwan, Singapore, South Korea and Hong Kong. These countries experienced output
growth over the span of a couple of decades in excess of 6 percent per year. By almost all historical
standards, these are exceptionally high average growth rates. The key questions are (i) how did these
countries achieve these high growth rates? and (ii) what might the growth rate of these countries
look like in the next several decades? We will try to lay out some answers to these questions using
Solow growth theory and growth accounting.
The first question can be answered, at least in part, by using growth accounting. Specifically,
growth accounting can tell one what were the (proximate) sources of output growth. To do this we
will make use of the data produced in the well-known study by Alwyn Young.22 The data from
Young’s paper are reproduced here. The data cover roughly the period 1966-1990.

Table 3.4: Asian Growth “Miracle”

Variable          Hong Kong   South Korea   Singapore   Taiwan
Labor’s Share       .628         .68           .47        .71
GDP Growth          7.3         10.4           8.5        9.6
Capital Growth      8.0         13.7          11.5       12.3
Labor Growth        3.2          6.4           5.7        5.1
TFP Growth          2.3          1.6          -0.3        2.4
Growth rates are in percent per year.
Source: Young (1994)

The table shows that the growth in GDP ranges from a low of 7.3 percent for Hong Kong to
a high of 10.4 percent for South Korea. The table also shows that capital growth exceeds output
growth in each country over this time period. In each country, except Hong Kong, this occurs
as the investment-GDP ratio increases over the time period. The labor growth rate is greater in
each country than the growth rate of what Young terms “raw labor”, which is a growth rate of
unweighted labor hours. The constructed series for labor effectively weights the hours of more
skilled labor more highly. Since the education levels of the workforce increase markedly from the
beginning to the end of the sample period in each country, measured labor grows more rapidly than
raw labor. Another element behind the high rates of growth of labor is that labor force participation
is increasing over the period in each country.
22 The Tyranny of Numbers: Confronting the Statistical Realities of the East Asian Growth Experience, NBER working

paper 4680, 1994.


The findings are that average technology growth, stated in the Table as TFP (Total Factor
Productivity) growth, is 2.3 percent in Hong Kong, 1.6 percent in South Korea, −0.3 percent in
Singapore and 2.4 percent in Taiwan. Young notes in his paper that although rates of TFP growth
on the order of 2 percent are relatively high, they are not that dissimilar to TFP growth rates calculated
for some developed countries in the world over a similar time period.23 For example, the growth
rate of the technology, based on BLS data on private, non-farm business sector, from Figure 3.6
averaged 1.1 percent per year for the US over the period 1949-2016.
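As a rough check, the TFP row of Table 3.4 can be approximately recovered from the other rows using the growth accounting equation. The sketch assumes that capital's share is one minus labor's share and that the period averages can be plugged directly into the formula. Young's own calculation is more detailed, so small discrepancies are to be expected.

```python
# Rows of Table 3.4: labor's share and average growth rates in percent per year.
table = {
    "Hong Kong":   {"labor_share": 0.628, "gdp": 7.3,  "capital": 8.0,  "labor": 3.2, "tfp": 2.3},
    "South Korea": {"labor_share": 0.68,  "gdp": 10.4, "capital": 13.7, "labor": 6.4, "tfp": 1.6},
    "Singapore":   {"labor_share": 0.47,  "gdp": 8.5,  "capital": 11.5, "labor": 5.7, "tfp": -0.3},
    "Taiwan":      {"labor_share": 0.71,  "gdp": 9.6,  "capital": 12.3, "labor": 5.1, "tfp": 2.4},
}

def input_contribution(row):
    """Part of GDP growth accounted for by capital and labor growth."""
    capital_share = 1 - row["labor_share"]
    return capital_share * row["capital"] + row["labor_share"] * row["labor"]

results = {}
for country, row in table.items():
    inputs = input_contribution(row)
    results[country] = {"residual": row["gdp"] - inputs, "input_share": inputs / row["gdp"]}
    print(f"{country}: residual {results[country]['residual']:.2f} "
          f"(table {row['tfp']}), inputs explain {results[country]['input_share']:.0%}")
```

Under these assumptions the residuals land within a tenth of a percentage point of the table's TFP row, and factor inputs account for at least two-thirds of GDP growth in every country.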
The bottom line of this work is that the data support the claim that the bulk of GDP growth over
the time period is due to tremendous growth rates of factor inputs and is not due to tremendously
large growth rates of TFP. This finding leads economists to say that it is not surprising that the
Asian Tigers grew at high rates given their high growth rates of factor inputs. This is just what
theory with a constant returns to scale production function would predict! What is not explained in
this growth accounting exercise is why it was the case that these were the countries that chose to
invest so heavily in human and physical capital.
The second question posed at the beginning of this section was what does economic theory
predict will happen to the growth rates of output per capita in the next several decades in these
countries. Since the growth in output was not due to unusually large growth rates of technology, it
seems reasonable to use Solow growth theory to answer this question. Solow growth theory predicts
that, savings rates held constant, the growth rate of output per unit of labor input will converge to
the growth rate of the technology. Taking the US to be a country which is approximately in steady
state, this particular theory predicts that future growth rates in these countries in the next several
decades will look much more like the per capita growth rate in the US economy which has averaged
about 2 percent growth per year over the last 100 years.
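The convergence prediction can be sketched by simulating the Solow model from a starting point well below its steady state. We assume the Cobb-Douglas law of motion kt+1 = [s kt^β + (1 − δ)kt ]/[(1 + g)(1 + n)] and illustrative parameter values. These values are not calibrated to any of the four economies.

```python
beta, s, delta, g, n = 0.3, 0.3, 0.05, 0.02, 0.01   # illustrative assumptions

def next_k(k):
    """Law of motion for capital per effective unit of labor (Cobb-Douglas case)."""
    return (s * k ** beta + (1 - delta) * k) / ((1 + g) * (1 + n))

k_star = (s / ((1 + g) * (1 + n) - (1 - delta))) ** (1 / (1 - beta))

k = 0.2 * k_star                  # start well below the steady state
per_worker_growth = []
for t in range(300):
    k_new = next_k(k)
    # Output per worker is A_t * k_t**beta, so its gross growth is (1 + g) * (k_new / k)**beta.
    per_worker_growth.append((1 + g) * (k_new / k) ** beta - 1)
    k = k_new

for t in (0, 10, 50, 299):
    print(t, round(per_worker_growth[t], 4))
```

Growth in output per worker starts well above g and declines steadily toward g as capital per effective unit of labor approaches k∗, with the saving rate held fixed throughout.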

3.7 Golden Rule


Within the context of the technology for production used in the Solow growth model it is natural
to try to address normative questions. Recall that normative questions deal with what should be
or what ought to be according to some set of values. Thus, a set of values allows one to describe
allocations which are “good” versus those that are “bad” in some theoretical world. This section
seeks to answer the two questions below. Early theoretical work on these issues was done by
Edmund Phelps.24

Question 1: In the context of growth theory, which allocations are clearly bad allocations?

Question 2: What are the observable implications of these bad allocations?

3.7.1 Bad Allocations


To answer the first question, let us first ask the question of which steady state of the Solow model
is the best steady state to live in. To answer this question, I will put forward the assumption that
people living in this world care only about the path of consumption over time. This is where we use
a “set of values”. In particular, I will assume that consumption paths that have higher consumption
at each date are preferred to those with lower consumption at each date. With this assumption, the
23 Solow’s measure of technology growth has various differing labels in the literature. Sometimes it is called total factor

productivity (TFP) growth and sometimes it is called technology growth. Whatever the label, within models based upon
an aggregate production function it captures the proportionate upward or downward shift of the production function
about the current inputs.
24 See Phelps (1961), The Golden Rule of Accumulation: A Fable for Growthmen, American Economic Review, 51,

638- 43. Edmund Phelps received the Nobel Prize in 2006 partly for his work on the Golden Rule.

best steady state is then the steady state k that gives maximum consumption. Economists call this
steady state the Golden Rule steady state.
The Golden Rule steady state is easy to describe both with a graph and with simple mathematics.
First, consider the mathematics. The problem of choosing a steady state k to maximize consumption
is written in the first line below. The first term in the maximization problem is output and the
second term is steady state investment. Thus, the difference is consumption. The solution to this
problem is written in the second line below. The second line notes that the maximum should have
the property that there is no gain (in consumption) to having a little more or a little less capital.
Thus, the derivative or slope of the first line should be precisely zero at the Golden Rule capital-labor
ratio.

Max F(k, 1) − k[(1 + g)(1 + n) − (1 − δ )]

⇒ Fk (k, 1) − [(1 + g)(1 + n) − (1 − δ )] = 0

This situation is graphed in Figure 3.7. The Golden Rule steady state kGR occurs at the capital
level k where the distance between the production function and the steady state investment line
is greatest. Geometrically, this can be determined by shifting the steady state investment line
up vertically until the line is just tangent to the production function. The Figure highlights this
geometric description of the Golden Rule steady state. Note that the geometry amounts to the claim
that the slope of the production function equals the slope of the steady state investment line.
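
The first-order condition above can be solved in closed form once a production function is chosen. The sketch below assumes a Cobb-Douglas technology F(k, 1) = k**alpha, which is an assumption made only for illustration, and all parameter values and function names are hypothetical.

```python
# Sketch: Golden Rule capital-labor ratio under an ASSUMED Cobb-Douglas
# technology F(k, 1) = k**alpha. All parameter values are illustrative.

def golden_rule_k(alpha, g, n, delta):
    """Solve F_k(k, 1) = (1 + g)(1 + n) - (1 - delta) for k."""
    slope = (1 + g) * (1 + n) - (1 - delta)   # slope of the investment line
    return (alpha / slope) ** (1 / (1 - alpha))

def steady_state_consumption(k, alpha, g, n, delta):
    """Output minus steady-state investment at capital-labor ratio k."""
    return k ** alpha - k * ((1 + g) * (1 + n) - (1 - delta))

alpha, g, n, delta = 0.3, 0.02, 0.01, 0.05   # hypothetical values
k_gr = golden_rule_k(alpha, g, n, delta)
c_gr = steady_state_consumption(k_gr, alpha, g, n, delta)

# Steady-state consumption is lower a little below and a little above k_gr.
for k in (0.9 * k_gr, 1.1 * k_gr):
    assert steady_state_consumption(k, alpha, g, n, delta) < c_gr
```

With a different production function the same first-order condition applies, but the closed-form expression for k would change.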
We are now ready to answer Question 1. The answer is that any allocation where the sequence
of capital stock always remains strictly above the Golden Rule steady state capital stock is a bad
allocation. The reason why such an allocation is bad is that one can come up with a feasible
alternative allocation that allows for comparatively more aggregate consumption in all periods.
To be concrete, assume that the economy is at a steady state above the level kGR . Then there is
a “free lunch” that can be had simply by decreasing the capital stock to the Golden Rule level and
maintaining it there forever. Clearly, this is possible since consumption at the Golden Rule is larger
than at any capital level above the Golden Rule. In summary, any steady state above the Golden
Rule steady state is bad since, paradoxically, the economy suffers from having too much investment.

3.7.2 Observable Implications of Bad Allocations


Now that we have a theory describing which allocations are “bad” it is natural to ask what are the
observable implications of these bad allocations. This might allow us to say whether or not actual
economies suffer from being “above the Golden Rule”. To do this, consider the four equations
below. Each of these is a simple rewriting of the first equation below which says that the capital-
labor ratio k is above the Golden Rule level. The first equation follows from the equation defining
the Golden Rule capital stock or, alternatively, from Figure 3.7. This equation is based on the
geometry in Figure 3.7 in that the slope of the production function is smaller than the slope of the
straight line defining steady-state investment.

Fk (k, 1) < [(1 + g)(1 + n) − (1 − δ )] (3.37)


1 + Fk (k, 1) − δ < (1 + g)(1 + n) (3.38)
kFk (k, 1) < k[(1 + g)(1 + n) − (1 − δ )] (3.39)
k(Fk (k, 1) − δ ) < k[(1 + g)(1 + n) − 1] (3.40)

These equations are useful as they have simple interpretations in terms of observables. The
second equation can be interpreted as stating that the gross interest rate (i.e. 1 + r ≡ 1 + Fk (k, 1) − δ )

Figure 3.7: Golden Rule Steady State

is less than the steady state gross growth rate of aggregate output (i.e. (1 + g)(1 + n)).25 Both of these
quantities can be measured. The third equation says that aggregate payment to capital kFk (k, 1) is
less than aggregate investment k[(1 + g)(1 + n) − (1 − δ )]. Once again, each of these quantities can
be measured. The fourth equation says that aggregate net payment to capital is less than aggregate
net investment.
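
Since equations (3.38) to (3.40) are rewritings of (3.37), all four should hold or fail together. A small numerical check illustrates this. The sketch assumes a Cobb-Douglas technology F(k, 1) = k**alpha and hypothetical parameter values; neither comes from the data discussed in this section.

```python
# Sketch: the four inequalities (3.37)-(3.40) hold or fail together.
# ASSUMES Cobb-Douglas F(k, 1) = k**alpha with illustrative parameters.

def above_golden_rule_checks(k, alpha, g, n, delta):
    fk = alpha * k ** (alpha - 1)          # marginal product of capital
    growth = (1 + g) * (1 + n)             # gross growth rate of output
    return {
        "3.37": fk < growth - (1 - delta),
        "3.38": 1 + fk - delta < growth,
        "3.39": k * fk < k * (growth - (1 - delta)),
        "3.40": k * (fk - delta) < k * (growth - 1),
    }

alpha, g, n, delta = 0.3, 0.02, 0.01, 0.05
k_gr = (alpha / ((1 + g) * (1 + n) - (1 - delta))) ** (1 / (1 - alpha))
high = above_golden_rule_checks(2.0 * k_gr, alpha, g, n, delta)  # above k_gr
low = above_golden_rule_checks(0.5 * k_gr, alpha, g, n, delta)   # below k_gr
```

Here every entry of `high` is true and every entry of `low` is false, as the algebra suggests.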
These interpretations were related to data in a well-known paper by Abel, Mankiw, Summers
and Zeckhauser (1989).26 They first note that relating the gross interest rate to the gross growth
rate of output is problematic. The reason that this is problematic is that there are many interest
rates and returns that can be calculated from data in actual economies. For example, one could
choose the average real interest rate on US Treasury Bills or, alternatively, the average real return
on the US stock market. The average real returns on Treasury bills and Treasury bonds are about 1
and 2 percent, respectively, and the average real return on the US stock market is about 6 percent

25 Recall that in a steady state of the Solow model output grows at the gross rate (1 + g)(1 + n), so that the net growth rate is
approximately equal to the population growth rate plus the growth rate of the technology.
26 Abel et al. (1989), Dynamic Efficiency: Theory and Evidence, Review of Economic Studies, Volume 56, 1-20.

over long time periods.27 One of these returns is larger than the 3 percent average growth rate of
real output in the US over long time periods and the other two are smaller. Thus, using average
returns one could conclude either that the US economy is well above the Golden Rule or well below,
depending on which asset one chooses to look at!

Figure 3.8: Investment and Payment to Capital in the US

[Figure 3.8 plots US data for 1929-85. Vertical axis: fraction of GNP (0 to 0.4). Horizontal axis: year (1920 to 1990). Series: Investment/Y and Payment to Capital/Y.]

The problem with the second equation is evidently that the model is too simple. Treasury bills
and stocks differ enormously in risk characteristics and, as a result, have different average returns.
The theory abstracts from risk, has a single real interest rate and therefore provides no help in
deciding which asset return to use and how to use it. To respond to this issue one needs a theory
that incorporates risk. While this type of analysis is done in the literature it is too advanced for a
useful discussion at the level of this book.
Abel et al. (1989) argue that the third and fourth equations above can be related to data in a
manner which does not lead to ambiguity. Following the discussion above, they compute the gross
payment to capital and the gross investment in the US as a ratio to GNP. These are empirical proxies
for the underlying theoretical concepts in the third equation above. Some of the empirical results of
their paper for the payment to capital and investment as a ratio to GNP are contained in Figure 3.8.
They find that the gross payment to capital is always well above gross investment in the US.
Their measure of gross payment to capital varies from a low of about 23 percent in 1945 to a
high of 32 percent in 1929. By comparison, gross investment varies from a low of 1.9 percent
in the Great Depression to a high of 19 percent in 1950. Thus, investment is always below the
payment to capital. This pattern also holds for a number of European countries plus Japan. Based
27 See Jeremy Siegel (2002, Table 1.1 and 1.2) "Stocks for the Long Run" Third Edition, McGraw Hill.

on this evidence, Abel et al. (1989) conclude that the advanced economies all appear to be below
the Golden Rule. Thus, there appears to be no free lunch to be had from growth theory. Stated
differently, the advanced economies of the world may have many problems but one problem that
they do not suffer from is having accumulated too much physical capital.
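
The comparison behind the third equation can be written out using only the ranges quoted above for the US. This is a rough sketch rather than a reproduction of the calculations in Abel et al. (1989), and the variable names are hypothetical.

```python
# Sketch of the comparison behind equation (3.39), using only the ranges
# quoted in the text for US data (shares of GNP).
payment_to_capital_low, payment_to_capital_high = 0.23, 0.32
investment_low, investment_high = 0.019, 0.19

# Being "above the Golden Rule" would require payment to capital < investment.
# Even the lowest capital payment share exceeds the highest investment share,
# so the inequality in (3.39) fails in every year covered by these ranges.
always_below_golden_rule = payment_to_capital_low > investment_high
print(always_below_golden_rule)
```

Comparing the two extremes is a sufficient check only. A year-by-year comparison would use the full series plotted in Figure 3.8.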

3.8 Key Concepts


A production function Y = AF(K, L) has constant returns to scale provided that when all
factor inputs are scaled up or down by a common factor, then output is also scaled by the
same factor.
The marginal product of an input is the extra output produced by a small increase in that
input, other things equal.
A production function Y = F(K, L) has diminishing marginal products provided that the
marginal product of capital (labor) falls as the quantity of capital (labor) is increased, holding
other inputs constant.
The Golden Rule steady state is the steady-state capital-labor ratio that produces the highest
steady-state consumption level.
4. Dynamic Consumer Theory

We review some of the main points of consumer theory which are relevant to decision making over
the life cycle. Much of this material should be familiar from an introductory or intermediate level
microeconomics course. This chapter assumes that the student has a basic knowledge of the theory
of preferences, utility and consumer demand from previous course work.
First, the two-good problem from standard static consumer theory is presented. Second, the
two-period problem in dynamic consumer theory is presented. The main result is that static and
dynamic theory are the same after reinterpreting prices and income. Third, we show how to extend
the dynamic consumer theory framework to arbitrarily many periods. The multi-period framework
is helpful for interpreting data on yearly consumption expenditures and follows without too much
effort from the analysis of the two-good problem.

4.1 Static Consumer Theory: Two Good Case


The problem studied in static consumer theory (for the case of two goods) is that of a consumer who
maximizes a utility function U(c1 , c2 ) by choosing consumption (c1 , c2 ) subject to the restriction
that consumption lies in a budget set. The budget set is defined by p1 c1 + p2 c2 ≤ I.1
An interpretation of this problem is as follows. A consumer with I dollars of income walks
into a grocery store. There are two goods (lemonade and sunscreen) with posted prices p1 and p2 .
The consumer can choose any combination of these two goods but cannot spend more than his/her
income I. The consumer makes the best choice among those that are affordable.
The typical graph of this situation is provided in Figure 4.1. Indifference curves represent
different combinations of the two goods providing the same level of utility. Here it is assumed that
indifference curves located further northeast lead to strictly higher utility. Clearly, if more of any
good is always better, other things equal, then such an assumption will produce indifference curves
with this property. This assumption will also produce indifference curves which are downward
sloping.
1 c1 and c2 denote the consumption of goods 1 and 2, whereas p1 and p2 denote the corresponding prices. I denotes
income.

Figure 4.1:

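[Figure 4.1 shows the budget line and an indifference curve. Vertical axis: lemonade (c2 ), with intercept W /p2 . Horizontal axis: sunscreen (c1 ), with intercept W /p1 . Budget line slope = −p1 /p2 . Indifference curve slope at the chosen point (c1 , c2 ) = −(α/(1 − α))(c2 /c1 ).]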

A solution (c1 , c2 ) to this problem satisfies two conditions: (i) the slope of the indifference
curve (which economists call the marginal rate of substitution (MRS)) at (c1 , c2 ) equals the slope
of the budget line and (ii) (c1 , c2 ) lies on the budget line so that the consumer spends all of his/her
income (denoted W rather than I in Figure 4.1). These two conditions are stated below.

MRS(c1 , c2 ) = −p1 /p2 (4.1)
p1 c1 + p2 c2 = I (4.2)

If one were given a specific utility function U(c1 , c2 ), then one could calculate the marginal
rate of substitution function MRS(c1 , c2 ) implied by U and proceed to solve these two equations.
The result would be a system of two Marshallian demand equations stating the best choices of
each good as functions of prices (p1 , p2 ), income I and parameters describing the utility function.
Here we use the result that MRS(c1 , c2 ) = −U1 (c1 , c2 )/U2 (c1 , c2 ), where U1 and U2 denote partial
derivatives of the utility function with respect to goods 1 and 2.2 These derivatives describe the
marginal utilities of consuming an extra unit of these goods.
Consider the case where U(c1 , c2 ) = c1^α c2^(1−α) or, equivalently, where U(c1 , c2 ) = α log(c1 ) +
(1 − α) log(c2 ). For this utility function the marginal rate of substitution MRS(c1 , c2 ) and the
Marshallian demand functions describing best choices are as follows:
2 I will present it for those who are interested in where it comes from.
Step 1: Define an indifference curve with the equation U(c1 , c2 ) = constant.
Step 2: We want to find out how c2 changes as c1 changes along an indifference curve. So let one variable be a
function of the other: U(c1 , c2 (c1 )) = constant.
Step 3: Differentiate using the chain rule: U1 +U2 dc2 (c1 )/dc1 = 0.
Step 4: Rearrange to get the result: dc2 (c1 )/dc1 = −U1 /U2 .
Step 5: Thus, MRS(c1 , c2 ) = −U1 (c1 , c2 )/U2 (c1 , c2 ).
Step 6: When U(c1 , c2 ) = α log(c1 ) + (1 − α) log(c2 ), then MRS(c1 , c2 ) = −αc2 /((1 − α)c1 ) because U1 (c1 , c2 ) = α/c1 and
U2 (c1 , c2 ) = (1 − α)/c2 .

c1 = α I/p1 (4.3)
c2 = (1 − α) I/p2 (4.4)
MRS(c1 , c2 ) = −U1 (c1 , c2 )/U2 (c1 , c2 ) = −[α(1/c1 )]/[(1 − α)(1/c2 )] = −(α/(1 − α))(c2 /c1 ) (4.5)

The interpretation is that the consumer always spends a constant fraction of income I on each
good regardless of prices. The preference parameters α and (1 − α) determine these fractions.
Another feature of the optimal consumption is that when income increases or decreases, prices held
constant, then consumption of each good moves exactly proportionally to income. This is a special
case of all goods being “normal” goods where the income elasticity is precisely equal to 1. Lastly,
the demand curve traced out by varying the price of each good, other things equal, is downward
sloping.
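
The demand functions (4.3) and (4.4) can be checked against the two optimality conditions (4.1) and (4.2). The following is a minimal sketch with made-up prices, income and preference parameter; the function names are hypothetical.

```python
# Sketch: Marshallian demands (4.3)-(4.4) for
# U = alpha*log(c1) + (1 - alpha)*log(c2). Numbers are illustrative only.

def demands(alpha, p1, p2, income):
    c1 = alpha * income / p1          # equation (4.3)
    c2 = (1 - alpha) * income / p2    # equation (4.4)
    return c1, c2

def mrs(alpha, c1, c2):
    """Slope of the indifference curve, equation (4.5)."""
    return -(alpha / (1 - alpha)) * (c2 / c1)

alpha, p1, p2, income = 0.25, 2.0, 4.0, 100.0
c1, c2 = demands(alpha, p1, p2, income)
spending = p1 * c1 + p2 * c2          # should equal income, condition (4.2)
slope_gap = mrs(alpha, c1, c2) - (-p1 / p2)   # should be zero, condition (4.1)
```

In this example the consumer spends 25 on good 1 and 75 on good 2, which are the fractions α and 1 − α of income.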

4.2 Dynamic Consumer Theory: Two Periods


The problem studied in dynamic consumer theory is that of a consumer who maximizes a utility
function U(c1 , c2 ) subject to the restriction that consumption lies in a budget set.3

Figure 4.2:
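[Figure 4.2 shows the budget line and an indifference curve for the two-period problem. Vertical axis: period 2 consumption, with intercept W (1 + r). Horizontal axis: period 1 consumption, with intercept W . Budget line slope = −(1 + r). Indifference curve slope at the chosen point (c1 , c2 ) = −(α/(1 − α))(c2 /c1 ). Labor incomes w1 and w2 are marked on the axes.]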

The dynamic theory differs from static consumer theory in two main ways. The first is that
the interpretation of a good differs. Good 1 is now the consumption good in time period 1,
3 In the early 1900’s Irving Fisher considered the two-period problem that we analyze. In the 1950’s Modigliani and

Brumberg analyzed many period versions of the Fisher model. Modigliani later received the Nobel prize in 1985, in part,
for this work. In the 1950’s Milton Friedman considered a version of this model where labor income is risky. Friedman
received the Nobel prize in 1976 for this work. The standard theory that present-day economists use to think about
consumption and savings behavior is a result of this line of research.

whereas good 2 is now the consumption good in period 2. Thus, there is only one good per time
period. This difference means that we will be assuming that consumers make optimal consumption
plans over a two-period lifetime rather than just at a point in time. We will assume that there is
no uncertainty about future labor income and that the consumer correctly forecasts future labor
income. Thus, we assume that any consumer is both forward thinking and an optimizer. While
real-world consumption-savings decisions are made when there is substantial risk related to future
labor earnings, we abstract from this risk.
The second difference is that the budget constraint is written differently. The budget set is
defined by the two equations below. The terms w1 and w2 refer to wages (labor earnings) received
in period 1 and 2 respectively. The term a2 is asset holding carried from period 1 into period 2 and
r is the real interest rate. As there is no risk in the model this is a risk-free real interest rate.

c1 + a2 ≤ w1
c2 ≤ a2 (1 + r) + w2
The typical graph of this dynamic utility maximization problem is presented in Figure 4.2. In
this graph the present value of labor income is denoted W . We use the convention that W = I =
w1 + w2 /(1 + r). In Figure 4.2, the consumer can consume more in period 1 than labor income in
period 1 (i.e. c1 > w1 ) if he/she wants to do so. The maximum possible period 1 consumption is
achieved by borrowing against period 2 labor income. The maximum that can be borrowed is w2 /(1 + r)
as then the consumer could use all of period 2 labor income to pay back this borrowing and the
associated interest costs. Sometimes economists say that w2 /(1 + r) is the present value of w2 units of the
consumption good tomorrow in terms of units of the consumption good today. The consumer also
has the possibility of consuming nothing today but consuming a quantity w2 + w1 (1 + r) tomorrow.
This is achieved by saving all of the labor income w1 today and converting this into w1 (1 + r) goods
tomorrow by purchasing the financial asset that pays a risk-free real interest rate r.
Figure 4.2 is qualitatively the same as Figure 4.1 in that it has the same geometry. Thus, at
a mathematical level both static and dynamic consumer theory must be the same. The reason
why the graph for the static and dynamic consumer theory problems are the same is that the
two budget restrictions from dynamic consumer theory are equivalent to the budget constraint
p1 c1 + p2 c2 ≤ I. To see this mathematically, just add together each of the two period budget
constraints after multiplying the constraint of period 2 by 1/(1 + r). This multiplication effectively
brings the quantities of period 2 to present (or period 1) value. The asset term then drops out. These
two equations are listed in the definition below.

Definition 4.2.1 The Present-Value Budget Constraint of a Two-Period utility maximization
problem dictates that the present value of consumption is no more than the present value of labor
income.
c1 + c2 /(1 + r) ≤ w1 + w2 /(1 + r)
This constraint is algebraically obtained by adding together the one-period constraints of
period 1 and period 2, after scaling the period 2 constraint by 1/(1 + r). These are listed below.

c1 + a2 ≤ w1
c2 /(1 + r) ≤ (w2 + a2 (1 + r))/(1 + r)
The budget constraint of the dynamic utility maximization problem can be seen as a reinterpretation
of the static utility maximization problem, where we reinterpret the two consumption goods
as consumption of a single good in periods 1 and 2 and we reinterpret wealth as the present value
of income flows. To see this formally set prices as p1 = 1, p2 = 1/(1 + r) and income as I = w1 + w2 /(1 + r).
Each of these three terms has a simple interpretation. p1 and p2 are the present value prices of

consumption in periods 1 and 2 stated in units of the period 1 consumption good. These prices state
how many time 1 goods are needed to purchase one unit of the time t = 1, 2 good.
Present value is a familiar concept from introductory economics and finance courses. In the
dynamic utility maximization problem, the present value of future labor income, for example, can
be interpreted as the maximum amount of borrowing that could be fully repaid (with interest) over
the life of the consumer by using all future labor income to pay off this borrowing. I is the present
value of current and future labor income. It equals current labor income w1 plus the present value
of future labor income w2 .
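
The borrowing interpretation of present value can be illustrated with a short computation. The sketch below is only an illustration; the function name and the numbers are hypothetical and do not come from the text.

```python
# Sketch: present value of a stream of flows dated period 1, 2, ...
def present_value(flows, r):
    """Discount the period j flow by (1 + r)**(j - 1) and add up."""
    return sum(x / (1 + r) ** (j - 1) for j, x in enumerate(flows, start=1))

w1, w2, r = 60.0, 42.0, 0.05                 # hypothetical labor incomes and rate
max_borrowing = present_value([0.0, w2], r)  # w2/(1 + r)
repayment = max_borrowing * (1 + r)          # principal plus interest in period 2
income = present_value([w1, w2], r)          # I = w1 + w2/(1 + r)
```

Here the consumer can borrow 40 in period 1, and repaying that loan with interest uses up exactly the period 2 labor income of 42.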
We can write the solution to the dynamic consumer problem using the corresponding solution
to the static consumer theory problem. For example, when the utility function is U(c1 , c2 ) = c1^α c2^(1−α)
or is U(c1 , c2 ) = α log(c1 ) + (1 − α) log(c2 ) then the solution to the utility maximization problem
is given below. We also write down the optimal savings decision a2 which is simply labor income
less consumption. Note that the optimal savings decision is backed out from the budget constraint
once one knows optimal consumption behavior. Thus, optimal savings behavior is determined from
optimal consumption behavior and the nature of budget constraints.

Theorem 4.2.1 If a consumer solves the two period utility maximization problem of choosing c1
and c2 to maximize U(c1 , c2 ) = α log(c1 ) + (1 − α) log(c2 ) subject to the present-value budget
constraint, then the consumer’s behavior is as follows:
c1 = α I/p1 = αI
c2 = (1 − α) I/p2 = (1 − α) I/[1/(1 + r)]
I = w1 + w2 /(1 + r)
a2 = w1 − c1

The decision rules say that the consumer will spend fractions α and 1 − α of the present value of
income I on period 1 and period 2 consumption, respectively. I = w1 + w2 /(1 + r) is the relevant notion
of income in the two-period model. The price p1 = 1 because the value of 1 unit of the time 1
good in units of the time 1 good is clearly 1. The price of the time 2 good in units of the time 1
good is p2 = 1/(1 + r).

Proof. First, the consumption plan is in the budget set. We know this because calculating the
present value of the consumption plan gives I. This means that the plan is affordable and that no
resources are “thrown away”. Second, the slope of the indifference curve, which is −(α/(1 − α))(c2 /c1 ), equals
the slope of the budget line, which is −(1 + r), when the consumption plan (c1 , c2 ) is dictated by the
solution above. Any other consumption plan that has a present value of I will have a marginal rate
of substitution not equal to the slope of the budget line and, thus, total utility could be increased by
a small movement along the budget line for any other such plan. Thus, the only candidate solution
that has not been ruled out is the proposed solution. □
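
The two steps of the proof can be verified numerically for particular values. The following sketch uses hypothetical labor incomes, interest rate and preference weight, and the function name is an assumption made for illustration.

```python
# Sketch of the decision rules in Theorem 4.2.1 with hypothetical inputs.
def two_period_plan(alpha, w1, w2, r):
    income = w1 + w2 / (1 + r)            # present value of labor income, I
    c1 = alpha * income                   # uses p1 = 1
    c2 = (1 - alpha) * income * (1 + r)   # uses p2 = 1/(1 + r)
    a2 = w1 - c1                          # savings (negative means borrowing)
    return c1, c2, a2

alpha, w1, w2, r = 0.5, 80.0, 21.0, 0.05
c1, c2, a2 = two_period_plan(alpha, w1, w2, r)

# Step 1 of the proof: the plan exhausts the period 2 budget constraint.
period2_gap = c2 - (a2 * (1 + r) + w2)
# Step 2 of the proof: the MRS equals the slope of the budget line.
mrs_gap = -(alpha / (1 - alpha)) * (c2 / c1) - (-(1 + r))
```

With these numbers the consumer saves 30 in period 1 and consumes 50 and 52.5 in the two periods.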

4.3 Dynamic Consumer Theory: Many Periods


We will now generalize the model so that a lifetime has n periods instead of just two. Consider a
utility function that depends on the consumption flow in each of the n periods of life U(c1 , c2 , ..., cn ).
In particular consider the following functional form for the utility function:
U(c1 , c2 , ..., cn ) = α1 u(c1 ) + α2 u(c2 ) + ... + αn u(cn )
U(c1 , c2 , ..., cn ) = α1 log(c1 ) + α2 log(c2 ) + ... + αn log(cn )

According to this functional form, the overall utility of a consumption profile c1 , c2 , c3 , ..., cn is
a weighted sum of period utilities derived in each period. The parameter α j is the weight of the
utility u(c j ) derived by the consumer in period j in the consumer’s overall utility. The weights α j
are numbers between zero and one that add up to one. These weights help put the utils from each
period j in terms of utils of period 1. A common assumption is that the weights decline with j and
reflect that the consumer is impatient. The second line says that the utility derived from c j in a
given period is given by the log of c j . In other words, the period utility function is chosen to be
u(c j ) = log(c j ).
The budget constraint is described by n inequalities. Each inequality describes the budget
constraint in a model period. Each of the period constraints says that the resources devoted to
consumption and savings (on the left hand side) cannot exceed the resources obtained from labor
income, savings from last period and the interest received on savings from last period. Notice that
savings can be negative (a j < 0). This happens when the agent takes out a loan. These inequalities
are provided below. Note that the first and last period differ in that by assumption the agent is born
in period 1 with no assets and in that in the last period the agent is not allowed to take out new
loans and undertakes no savings.

c1 + a2 ≤ w1
c2 + a3 ≤ w2 + a2 (1 + r)
c3 + a4 ≤ w3 + a3 (1 + r)
...
cn ≤ wn + an (1 + r)

We can apply the same algebra step used in the two-period case to convert these n period-
by-period budget constraints into a single present value budget constraint. This can be done by
multiplying the period j constraint by 1/(1 + r)^( j−1) (the present value price of period j consumption
in terms of period 1 goods) and then adding all the constraints together. The result is the present
value budget constraint below. Here, as in the two period case, we are abstracting from taxes and
transfers (from government and family) and any initial financial wealth. Government taxes and
transfers can easily be analyzed by interpreting w j to be the net labor income received in period j
after government taxes and transfers.

Definition 4.3.1 The Present-Value Budget Constraint of the n-period utility maximization
problem states that the present value of the consumption flows of each of the n periods cannot
exceed the present value of the income flows received in each of the n periods.
c1 + c2 /(1 + r) + ... + cn /(1 + r)^(n−1) ≤ w1 + w2 /(1 + r) + ... + wn /(1 + r)^(n−1)

When the utility function is specialized to be U(c1 , c2 , ..., cn ) = α1 log(c1 ) + α2 log(c2 ) + ... +
αn log(cn ), then we can write down the demand functions describing best choices lying on the
budget constraint.

Theorem 4.3.1 If a consumer solves the n-period utility maximization problem of choosing
c1 , c2 , ..., cn to maximize U(c1 , c2 , ..., cn ) = α1 log(c1 ) + α2 log(c2 ) + ... + αn log(cn ) subject to
the present-value budget constraint, then the consumer’s behavior is characterized by the following
n decision rules:

c1 = α1 I
...
c j = α j I/[1/(1 + r)^( j−1)]
...
cn = αn I/[1/(1 + r)^(n−1)]

The decision rules say that the consumer will spend a fraction α j of income I on period j
consumption, where income is I = w1 + w2 /(1 + r) + ... + wn /(1 + r)^(n−1). The asset holding behavior is
determined residually from the period by period budget constraints as follows: a j+1 = a j (1 +
r) + w j − c j given a starting value a1 = 0 and given consumption.

Although we do not provide a proof, the logic of Theorem 4.3.1 is similar to that of Theorem
4.2.1. First note that, according to the optimal plan, the present value of consumption over
the lifetime is equal to I. This implies that the consumption plan is affordable and that no
resources are “thrown away”. Second, one can verify that, at this proposed solution, across
any two neighboring periods the marginal rate of substitution (slope of indifference curve) is
MRS(c j , c j+1 ) = −U j (c1 , ..., cn )/U j+1 (c1 , ..., cn ) = −(α j /α j+1 )(c j+1 /c j ) and equals −(1 + r). Thus, the rate
at which the agent is indifferent to trading these goods is exactly the rate at which the financial
market allows him or her to trade these goods.
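
The decision rules and the residual asset path of Theorem 4.3.1 translate directly into a few lines of code. The sketch below is one possible way to write them down; the function name and the three-period inputs are hypothetical.

```python
# Sketch of the n-period decision rules; inputs are hypothetical.
def lifetime_plan(alphas, wages, r):
    """Return consumption c_1..c_n and assets a_1..a_{n+1} for log utility."""
    n = len(wages)
    # Index j = 0, ..., n-1 stands for model period j + 1.
    income = sum(w / (1 + r) ** j for j, w in enumerate(wages))   # I
    cons = [alphas[j] * income * (1 + r) ** j for j in range(n)]
    assets = [0.0]                                # a_1 = 0
    for j in range(n):
        assets.append(assets[-1] * (1 + r) + wages[j] - cons[j])
    return cons, assets

alphas = [0.5, 0.3, 0.2]      # weights sum to one
wages = [10.0, 30.0, 5.0]
r = 0.04
cons, assets = lifetime_plan(alphas, wages, r)

# Across neighboring periods c_{j+1}/c_j = (alpha_{j+1}/alpha_j)(1 + r),
# which is the MRS condition described above.
growth_gap = cons[1] / cons[0] - (alphas[1] / alphas[0]) * (1 + r)
```

The last entry of `assets` is zero up to rounding, which reflects that the plan uses all lifetime resources. The second entry is negative here because this consumer borrows in period 1.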

4.3.1 Lifetime Histories

At this stage it is helpful to try to develop a graphical sense of the consumption and savings behavior
that this model is capable of producing. Here we will consider a case which is simple. To be
specific, let the agent live for n = 60 model periods. Think of these as covering ages of 21 to 80 in
real life. Assume that labor income w j is equal to 40 thousand dollars before retirement and zero
afterwards. Assume that consumers retire at age 62 (that is, in period j = 41). Let the real interest
rate be zero (i.e. r = 0) each model period and let all the weights on period utility α j = 1/n = 1/60
be the same each period.
Under these assumptions, the demand functions above imply that consumption is “flat” over the
lifetime in the sense that it is the same each model period. This follows mathematically from the
demand equations above. Specifically, the denominator is the same in each demand function when
the interest rate is zero and the numerator is the same in each equation. This combination of interest
rate and preference parameter assumptions produces a desire for a smooth or a flat consumption
profile over the lifetime.
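
The lifetime history in this example can be computed directly from the decision rules. The sketch below makes one assumption that should be flagged: labor income is taken to be 40 in periods 1 through 40 and zero from period 41 on. The variable names are hypothetical.

```python
# Sketch of the example above: n = 60, r = 0, equal weights, and labor income
# of 40 (thousand dollars) until retirement. ASSUMPTION: income is zero from
# period 41 on, so it is earned in periods 1 through 40.
n, r, retire_period = 60, 0.0, 41
wages = [40.0 if j < retire_period else 0.0 for j in range(1, n + 1)]
alphas = [1.0 / n] * n

income = sum(w / (1 + r) ** (j - 1) for j, w in enumerate(wages, start=1))
cons = [a * income * (1 + r) ** (j - 1) for j, a in enumerate(alphas, start=1)]

assets = [0.0]                                   # a_1 = 0
for w, c in zip(wages, cons):
    assets.append(assets[-1] * (1 + r) + w - c)  # a_{j+1}

peak_assets = max(assets)                        # reached at retirement
```

Under these assumptions lifetime income is 1600, consumption is about 26.7 in every period, and assets rise to about 533 at retirement before being run down to zero.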
It is clear that increasing the interest rate, holding preference parameters constant, will lead to
an upward sloping profile of consumption over the lifetime as future consumption is now cheaper
in present value terms. Thus, the growth rate of consumption would be positive across neighboring
periods. We can calculate how responsive the growth rate of consumption is to a change in the real
interest rate using the equations above. It would be useful to express this in terms of the percentage
increase in the consumption growth rate to a percentage change in the interest rate. A quick bit of
algebra indicates that a one percent increase in the gross interest rate (1 + r) leads to a one percent

increase in the (gross) growth rate of consumption (c j+1 /c j ).4 This holds when the period utility
function is u(c j ) = log c j .
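
The one-for-one response can also be checked numerically. The sketch below uses the decision rules, which imply that the gross growth rate of consumption is (α j+1 /α j )(1 + r). The weights and interest rates are hypothetical.

```python
import math

# Sketch: with log utility, c_{j+1}/c_j = (alpha_{j+1}/alpha_j) * (1 + r),
# so log consumption growth moves one-for-one with log(1 + r).
def gross_consumption_growth(alpha_j, alpha_next, r):
    return (alpha_next / alpha_j) * (1 + r)

alpha_j, alpha_next = 0.02, 0.019      # hypothetical weights
r_low, r_high = 0.02, 0.03             # hypothetical interest rates
elasticity = (
    math.log(gross_consumption_growth(alpha_j, alpha_next, r_high))
    - math.log(gross_consumption_growth(alpha_j, alpha_next, r_low))
) / (math.log(1 + r_high) - math.log(1 + r_low))
```

The computed elasticity equals one up to rounding, whatever weights are chosen, since the ratio of weights drops out of the difference in logs.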

4.4 Some Uses Of The Model


The multi-period model of the previous section has many uses. In this section we will assume that
consumers are making consumption and saving choices to maximize a utility function. We will
consider the following three issues using this framework:
1. It is well known that the profile of household consumption over the life cycle is hump-shaped
with peak consumption occurring around age 50. Why is consumption hump-shaped?
2. Some changes in household income are largely permanent (e.g. a permanent rise in labor
income due to a promotion) whereas others are largely temporary (a bonus paid for temporar-
ily working overtime hours, lottery winnings, large capital gains or losses on stock holdings
this year). What does theory say about how the response to unanticipated temporary versus
permanent changes differs? What does the data say about these responses?
3. It is well known that in US cross-section data average household savings rates increase
strongly as household income increases. It is also the case that over (some) long time periods
the aggregate savings rate in the US did not change much despite large changes in income.5
Does the multi-period model imply both of these observations?
The next three subsections will take a shot at answering each of these three questions. It should
be clear that to answer such questions one needs a theory for how individuals behave.

4.4.1 Consumption Patterns Over The Life Cycle


A common finding from empirical work is that household consumption is hump-shaped over the
life cycle so that household consumption is largest in the middle of the life cycle. This pattern arises
when one examines the average consumption by age of a large collection of households whose
household heads share the same year of birth.
Figure 4.3 plots U.K. consumption and income data for two education groups - a lower
educational group (labeled compulsory) and a higher group (labeled post compulsory).6 The
separate plots in this figure correspond to separate birth cohorts (all those households with household
heads born in the same year). Any one of the connected lines in Figure 4.3 plots the average real
consumption or income for a specific birth cohort over the sample period.
Both the consumption plots and the income plots have a hump-shaped pattern. Part of this hump-
shaped pattern in consumption is eliminated when one tries to adjust for differences in household
size over the life cycle - say by dividing household consumption by the number of household
members. Nevertheless, even when an adjustment is made for household size, a hump-shaped pattern
still remains.
Why is consumption hump shaped? One way to see the difficulty in answering this question
is to consider something that does not work. One idea is that household labor income is strongly
hump-shaped. So if households simply consumed their labor income this would produce a hump.
The problem is that our theory is one of rational households. They forecast future labor income
to calculate how much they are worth (i.e. the present value of lifetime income) and based upon
this they make their best choices. The reader will recall the optimal consumption equation derived
earlier. This tells us that within this theory the timing of the receipt of labor income is not important
4 The calculation is as follows: d log(c j+1 /c j )/d log(1 + r) = d log((α j+1 /α j )(1 + r))/d log(1 + r) = 1.
5 These observations were important historically for the development of modern dynamic consumer theory. They

were central in work in the 1950’s by Friedman, Modigliani, Kuznets and others.
6 The figure is taken from Attansio and Weber (2010), Consumption and Saving: Models of Intertemporal Allocation

and Their Implications for Public Policy, Journal of Economic Literature, vol. 48, pages 693-751.
4.4 Some Uses Of The Model 63

Figure 4.3: Cohort Consumption and Income Profiles

Cohort profiles Cohort profiles


Compulsory Post-compulsory
1000
500
0

20 40 60 80 20 40 60 80
Age of Head
Income Consumption
Graphs by educ

Figure 2 Average income and consumption by cohort and education

for choosing how much to consume each period. Instead, what matters is the present value of
income I, the preference parameters α j , the market interest rate r and nothing else. It is significant
that the theory uses the assumption that consumers can borrow against future income.
Consider a special case of values of the preference parameters in relation to the interest rate.
Suppose that the preference parameters $\alpha_j$ governing the importance of period $j$ consumption are
proportional to the present value price of period $j$ consumption: $\alpha_j = \gamma/(1+r)^{j-1}$ for some positive
value $\gamma$. If so, then the equation for optimal consumption behavior $c_j = \alpha_j \times I \times (1+r)^{j-1}$ implies
that consumption profiles over the life cycle are “flat” or constant, since $c_j = \gamma \times I$ in every period.
This does not work either! Ah, but it does tell us something that does work. If the preference
parameter $\alpha_j$ is hump shaped over the lifetime, in that $\alpha_j$ is highest around age 50, and if the real
interest rate is approximately zero, then this theory produces a hump-shaped consumption profile
$c_j = \alpha_j \times I \times (1+r)^{j-1}$. This is not a very impressive explanation, as one could explain any age
pattern in the data by an appeal to a similar pattern in the (unobserved) preference parameters.
It is also not the full explanation that present-day economists offer. It does, however, explain the
observations. If one wanted to test the theory, then one would need some observations to estimate
preference parameters and some additional observations that could be used to test the theory.
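
As a rough numerical illustration of the two cases just described, the sketch below evaluates $c_j = \alpha_j \times I \times (1+r)^{j-1}$ once for weights proportional to present value prices and once for hump-shaped weights with a zero interest rate. The function names, the six-period horizon and all numbers are illustrative assumptions, and the weights are assumed to be normalized to sum to one.

```python
def consumption_profile(alphas, lifetime_income, r):
    """Evaluate c_j = alpha_j * I * (1 + r)**(j - 1) for j = 1, ..., n."""
    # enumerate starts at 0, which matches the exponent j - 1 for period j.
    return [a * lifetime_income * (1 + r) ** k for k, a in enumerate(alphas)]


def normalize(weights):
    """Scale weights so that they sum to one (assumed normalization)."""
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative numbers only: six periods and lifetime income I = 600.
n, lifetime_income = 6, 600.0

# Case 1: alpha_j proportional to the present value price 1/(1+r)**(j-1).
r = 0.05
flat_alphas = normalize([1 / (1 + r) ** k for k in range(n)])
flat_profile = consumption_profile(flat_alphas, lifetime_income, r)

# Case 2: hump-shaped alphas combined with a zero interest rate.
hump_alphas = normalize([1, 2, 3, 3, 2, 1])
hump_profile = consumption_profile(hump_alphas, lifetime_income, 0.0)

print([round(c, 2) for c in flat_profile])  # same value in every period
print([round(c, 2) for c in hump_profile])  # rises, then falls
```

In the second case the shape of the consumption profile simply copies the shape of the assumed weights, which is why the text calls this explanation unimpressive.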

4.4.2 Consumption Responses to Temporary vs Permanent Shocks


What does the theory say about consumption and savings responses to a temporary versus a
permanent increase in labor income? To be specific, let us imagine that an agent has 50 years to
live. Each model period is a year. To make life easy assume that $\alpha_1 = \alpha_2 = \cdots = \alpha_{50} = 1/50$ and
that the real interest rate is zero each period (i.e. $r = 0$).
Let us consider the following 3 situations:
1. Situation 1: (Baseline Case) Labor income is flat over the life cycle and equal to 10 in each
period.
2. Situation 2: (Temporary Increase) Labor income is 20 in the first period (i.e. $w_1 = 20$) but
equals 10 in all other periods (i.e. $w_j = 10, \forall j \geq 2$).
3. Situation 3: (Permanent Increase) Labor income is 20 in each period.
The theory just described implies that consumption is flat over the life cycle. In Situation 1
consumption is equal to 10 each period. In Situation 2 consumption is equal to 10.2 each period,
since the present value of labor income is 20 + 49 × 10 = 510 and 510/50 = 10.2. In
Situation 3 consumption is again flat and equal to 20 in all periods. The numbers for consumption
are derived by plugging in the data for labor income, the interest rate and the preference parameters
into the function describing best choices that was presented earlier. The function describing the
best choice in period $j$ is restated below:

$$c_j = \alpha_j \, \frac{w_1 + \frac{w_2}{1+r} + \cdots + \frac{w_n}{(1+r)^{n-1}}}{1/(1+r)^{j-1}}$$
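
The three situations can be checked with a few lines of code. The sketch below plugs the labor income paths, $r = 0$ and $\alpha_j = 1/50$ into the best-choice function restated above. The function and variable names are illustrative, not part of the text.

```python
def present_value(wages, r):
    """Present value of labor income: w_1 + w_2/(1+r) + ... + w_n/(1+r)**(n-1)."""
    return sum(w / (1 + r) ** k for k, w in enumerate(wages))


def best_choice(alphas, wages, r):
    """Consumption in each period j: alpha_j * PV / (1/(1+r)**(j-1))."""
    pv = present_value(wages, r)
    return [a * pv / (1 / (1 + r) ** k) for k, a in enumerate(alphas)]


n, r = 50, 0.0
alphas = [1 / n] * n

situations = {
    "baseline": [10.0] * n,                    # Situation 1
    "temporary": [20.0] + [10.0] * (n - 1),    # Situation 2
    "permanent": [20.0] * n,                   # Situation 3
}

results = {}
for name, wages in situations.items():
    c = best_choice(alphas, wages, r)
    # First-period saving is labor income minus consumption in period 1.
    results[name] = (c[0], wages[0] - c[0])
    print(name, round(c[0], 2), round(wages[0] - c[0], 2))
```

With these inputs, first-period saving is 0 in the baseline, 9.8 after the temporary increase and 0 after the permanent increase, which is the contrast the text draws between income changes that are largely saved and those that are largely consumed.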
What are some of the messages that these three examples highlight about the theory? A first
message is that a “temporary” increase in labor income in the first period (compare Situation 1 to
Situation 2) is largely saved. Consumption increases each period by 2 percent even though labor
income in period 1 increases 100 percent. A way to talk through the logic of this example is to
say that when there is an increase in the present value of labor income the consumer wants
to increase the consumption of all normal goods. Since consumption in each model period is a
normal good in the example, the only way to increase consumption in future periods is by
accumulating financial assets.
A second message is that a permanent increase in labor income (compare situation 1 to situation
3) is largely consumed. We can talk through the logic of this example as follows. The consumer
sees that first period income is 20. If this were a temporary increase then it would be largely saved.
However, if there are good reasons to anticipate that the increase is permanent, then the agent will
increase consumption in all periods by 100 percent as lifetime income (the present value of current
and future labor income) has increased by 100 percent.
Due to the big difference in the response to temporary versus permanent changes in current
labor income predicted by this theory, economists have tried to see whether or not there is support
in data for this. Thus, economists try to find situations where there are surprise temporary increases
in income (e.g. lottery winnings) to see if the increase in consumption following such increases is
quite different from the increase in consumption following episodes where income can be argued
to have increased permanently. Broadly, the literature finds evidence for substantially smaller
consumption responses to temporary income shocks compared to same-size permanent income
changes.7 Such findings are good news for theory which extends the type of model considered in
this chapter to situations that explicitly allow for risk.
However, the findings from the empirical literature suggest that the most extreme versions of
the optimizing theory presented in this chapter will not be consistent with other data. For example,
economists have examined the consumption response to predictable future income increases that
arise from (1) US federal income tax rebates in 2001 or (2) checks issued to Alaska residents from
the Alaska Permanent Fund. The model from this chapter highlights that the precise timing of the
receipt of predictable future income should not affect consumption profiles but that the present
value of this income and an agent’s preferences are important determinants of this profile. For
the case of income tax rebates (a typical rebate was $300 or $600), there was an increase in a
household’s nondurable goods expenditures after the rebates were received. The increase was
largest for households with low financial wealth and low income. This is suggestive that households
face borrowing or liquidity constraints that are not part of the theoretical model considered in this
chapter. Jappelli and Pistaferri (2010) review this work.
7 This literature is reviewed by Jappelli and Pistaferri (2010), The Consumption Response to Income Changes,
Annual Review of Economics, 479-506.


4.4.3 Savings Rate Observations


Now let us try to understand how it could be the case that a version of the model just discussed
could produce a pattern of average savings rates whereby high income households in cross-section
data have a higher savings rate than lower income households. As mentioned earlier, this is the
pattern found in U.S. data.

Figure 4.4: Humped Earnings Imply a Hump-Shaped Wealth Profile

Let us use the simple model from the last subsection to get some understanding of how the
model might imply this behavior. Assume that the utility function parameters and interest rates
are such that optimal consumption profiles are approximately flat. Assume also that labor income
is strongly hump shaped as it is in U.S. data. A stylized graph of this situation is given in Figure
4.4. Note that asset holdings must be consistent with the equation $a_{j+1} = w_j + a_j(1+r) - c_j$. This
effectively implies that hump-shaped labor income $w_j$ and flat consumption $c_j$ imply that asset
holding has a hump.
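
A small simulation can make the asset equation concrete. The sketch below iterates $a_{j+1} = w_j + a_j(1+r) - c_j$ for a made-up hump-shaped income path with flat consumption and $r = 0$. The six-period horizon and the income numbers are assumptions chosen only for illustration.

```python
def asset_path(wages, consumption, r, initial_assets=0.0):
    """Iterate a_{j+1} = w_j + a_j * (1 + r) - c_j and return [a_1, ..., a_{n+1}]."""
    assets = [initial_assets]
    for w, c in zip(wages, consumption):
        assets.append(w + assets[-1] * (1 + r) - c)
    return assets


# Stylized, made-up numbers: hump-shaped labor income over six periods.
wages = [8.0, 12.0, 16.0, 12.0, 6.0, 6.0]
r = 0.0

# With r = 0, flat consumption that exhausts lifetime income is average income.
flat_c = sum(wages) / len(wages)
consumption = [flat_c] * len(wages)

assets = asset_path(wages, consumption, r)
saving = [w - c for w, c in zip(wages, consumption)]  # saving out of labor income

print(assets)  # [0.0, -2.0, 0.0, 6.0, 8.0, 4.0, 0.0]
print(saving)  # [-2.0, 2.0, 6.0, 2.0, -4.0, -4.0]
```

In this example the agent borrows early in life, accumulates assets in the high-income middle periods, and runs assets down to zero by the end of life, so the asset path has a hump.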
The upshot, as far as savings rates are concerned, is three-fold: (1) savings rates are low early
in life as income is temporarily low, (2) savings rates are high in the middle of the life cycle as income
is temporarily high and (3) savings rates are low late in life as income is low. This explanation
highlights that low income households dissave and are typically young and old agents, whereas
high income households typically save as they are middle-aged agents experiencing high earnings
and anticipating low earnings later in life.8
8 Milton Friedman suggested a complementary mechanism to produce the savings rate observations: high income
households have a high fraction with a positive-but-temporary earnings shock, whereas low income households have a
high fraction with a negative-but-temporary earnings shock. Optimal smoothing behavior dictates saving part of a positive
shock but dissaving part of a negative shock.

4.5 Overview
This chapter uses standard consumer theory from introductory microeconomics to develop a theory
of consumption and savings decisions over the lifetime. The theory takes the view that consumers
have perfect foresight over future labor market earnings and future interest rates and make best
lifetime plans. The perfect foresight assumption may not seem very realistic, but it does extend
standard consumer theory to apply to lifetime choices. The dominant framework for thinking
about consumption and savings decisions used by current-day economists is a much more elaborate
version of this simple theory derived from the work of Fisher, Modigliani and Friedman among
others.
Perhaps the most critical way that modern theory differs from the simple theory given in this
chapter is that modern theory allows for labor income risk, due both to economy-wide risk and
to individual-specific risk, and for some market imperfections. The market imperfections
considered by modern theory include the lack of some financial markets that could be used to insure
or hedge important components of labor income risk as well as borrowing constraints that impede
the possibility of using future income to finance current consumption.

4.6 Key Concepts


An indifference curve for an agent passing through the point $(c_1^*, c_2^*)$ is the set or collection
of all allocations $(c_1, c_2)$ that give the agent equal utility: $U(c_1, c_2) = U(c_1^*, c_2^*)$.
The marginal rate of substitution is the slope of the indifference curve at a point.
The marginal rate of substitution at a point also measures the rate at which a consumer
can substitute one good for another and stay on the same indifference curve.
When a utility function has positive marginal utilities then the marginal rate of substitution
at point $(c_1, c_2)$ can be calculated using the ratio of marginal utilities:
$MRS(c_1, c_2) = -\frac{U_1(c_1, c_2)}{U_2(c_1, c_2)}$. The MRS is negative because the indifference curve is downward sloping.


5. Life-Cycle Model

This section describes a version of the Solow growth model but with optimizing consumers. We
will call this model the “life-cycle model”. This label stresses the fact that the consumers that live
in this model economy pass through life-cycle stages in that they are born and will later die. This
will be relevant for the issue of which consumers in the model will want to buy physical capital and
which ones will want to sell physical capital.
The production technology in the life-cycle model is the same as in the Solow model. Thus,
output is produced by a constant returns technology using capital and labor inputs. Furthermore,
marginal products are diminishing. This means that many of the insights from growth theory will
hold within the life-cycle model. For example, three things will hold: (i) steady-state growth will
occur only if there is growth in technology, (ii) output growth will equal the sum of technology
growth and weighted input growth and (iii) there is the possibility that too much capital may be
accumulated within the model so that the model economy may potentially have a capital-labor ratio
greater than the Golden Rule capital-labor ratio.
The life-cycle model assumes that consumers maximize utility as is standard in microeconomics.
Adding optimizing consumers is important in a number of ways. First, and most obviously, the
savings rate of the economy will respond to policy changes and to shocks impacting the economy.
This was not true in the Solow model as the savings rate was exogenous. If one is sympathetic to
the view that people respond to incentives, then adding optimizing consumers is certainly consistent
with such a view. Second, one can ask whether or not the functioning of the model economy can
be improved. Economists find it natural to ask whether improvements in the model economy are
possible in the Pareto sense (i.e. is there an alternative feasible allocation that increases at least one
person’s utility without lowering anyone’s utility). If the individuals populating a theoretical model
have no clear preferences over things they end up choosing, then it is not so clear how to evaluate
whether the functioning of the economy can be improved. However, when the individuals do have
clear preferences, then it is natural to use these preferences and the Pareto criterion to determine if
the functioning of the economy can be improved.
The life-cycle model will be extremely simple. The simple structure of the model leads us to
use the model to gain qualitative insight but not quantitative insight. At a later stage, we will use
the model to get qualitative insight into the effects of (i) a one-time increase in the population, (ii)
an increase in government war expenditures that are financed in different ways, (iii) a temporary
tax cut, (iv) the adoption of an unfunded social security system and (v) a temporary or permanent
change in the technology. The simple structure will lead to clear insights. Thus, the model provides
qualitative answers to traditional questions related to how the economy responds to shocks and to
fiscal policy changes.

5.1 Benchmark Model


The key ingredients of the Benchmark Model are listed below:
1. Demographics: N young agents are born at each time period t = 1, 2, .... Agents live two
periods. These two assumptions imply that at any time t there are N young agents and N old
agents. The exact numerical value of N will not be important as we will typically focus on
ratio variables such as GDP per worker.
2. Preferences: Young agents care only about consumption when young and consumption
when old according to a utility function $U(c_y, c_o)$. Thus, a model period should be viewed
as corresponding to several decades. It is important to see what does not enter the utility
function. Specifically, labor or leisure time does not enter and the welfare of one’s children
does not enter. The first of these will imply that young agents will choose to spend all
available time (exactly one unit) working when given any positive compensation as work
produces no disutility. The second of these implies that while these agents end up producing
children they will not end up saving to provide a bequest for their children.

$$U(c_y, c_o) = \alpha \log(c_y) + (1-\alpha) \log(c_o)$$

3. Endowments: At each date, each young agent has one unit of work time but owns no
physical capital. Old agents cannot work. Thus, aggregate labor $L_t = N$ equals the number of
young agents in the economy. At the starting date ($t = 1$) of the economy the initial physical
capital is owned (evenly) by all the old agents.
4. Technology: The technology has constant returns. Physical capital can be accumulated.

$$Y_t = F(K_t, L_t) = A K_t^{\beta} L_t^{1-\beta}$$

$$K_{t+1} = K_t(1-\delta) + I_t$$

5. Accounting Framework: Aggregate consumption and investment equal GDP. GDP is
produced with aggregate capital and labor. GDP also equals labor income plus capital
income, where $w_t$ and $R_t$ are rental prices of labor and capital.

$$C_t + I_t = Y_t = F(K_t, L_t) = w_t L_t + R_t K_t$$

$$C_t = N c_{yt} + N c_{ot}$$

5.2 How the Benchmark Model Works


We assume that the model economy allocates consumption, labor and capital through the price system.
Each period there are several markets: a labor market, a rental market for capital and a market
for risk-free loans. The interaction between consumers who are on the supply side for labor and
firms (say one firm) who are on the demand side for labor will determine a real wage $w_t$ for labor at
any time period $t$. The rental market for capital determines the rental rate $R_t$ for physical capital.
Consumers will hold all capital and they make up the supply side of the market, whereas firms
are on the demand side. It will turn out that the rental market price $R_t$ and the real interest rate $r_t$
on loans are closely connected. We will come to the loan market later on but it is fair to say that
this market will be special since, absent government debt, consumers will be on both supply and
demand sides.
We assume that all markets in the model are competitive. Thus, at any time $t$, consumers will
take the wage $w_t$, the rental rate $R_t$ and the real interest rate $r_t$ as given and beyond their control.
Each young consumer born at time $t$ will then make a consumption-saving-labor plan over the
lifetime $(c_{yt}, c_{ot+1}, a_{t+1}, l_t)$. This plan solves the problem of maximizing utility over the lifetime,
given the budget constraint. This exact problem was studied in Theorem 4.2.1 from chapter 4. The
utility function of an agent is of the Cobb-Douglas form. Thus, optimal decisions take a simple
form and are listed below.

   
$$\begin{pmatrix} c_{yt} \\ c_{ot+1} \\ a_{t+1} \\ l_t \end{pmatrix} = \begin{pmatrix} \alpha w_t \\ (1-\alpha) w_t (1+r_{t+1}) \\ (1-\alpha) w_t \\ 1 \end{pmatrix}$$

We highlight two key points about the consumption-saving-labor plan. First, remember from
chapter 4 that when an agent can work when young but not when old then the present value of
current and future labor income is simply $w_t$ for the young agent alive in period $t$. Thus, these
agents are “worth” $w_t$. Second, the savings choice $a_{t+1}$ of young agents in period $t$ represents both
the holding of physical capital and real loans. As these two different assets are both risk-free in this
model it should make intuitive sense that they should have the same real return $r_{t+1}$.
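
The optimal plan of a young agent can be checked numerically. The sketch below computes the plan for given prices and confirms that it satisfies the budget constraint in each period of life and that the ratio of marginal utilities equals one plus the interest rate. The function name and the input values ($w_t = 5$, $r_{t+1} = 0.3$, $\alpha = 0.5$) are assumptions made for illustration.

```python
def young_agent_plan(wage, r_next, alpha):
    """Optimal plan for utility alpha*log(c_y) + (1 - alpha)*log(c_o)."""
    c_young = alpha * wage                       # consumption when young
    savings = (1 - alpha) * wage                 # a_{t+1}, assets carried into old age
    c_old = (1 - alpha) * wage * (1 + r_next)    # consumption when old
    labor = 1.0                                  # all available time is spent working
    return c_young, c_old, savings, labor


# Illustrative inputs (assumptions, not values from the text).
wage, r_next, alpha = 5.0, 0.3, 0.5
c_young, c_old, savings, labor = young_agent_plan(wage, r_next, alpha)

# Budget when young: consumption plus savings equals the wage.
budget_gap_young = c_young + savings - wage
# Budget when old: consumption equals savings plus interest.
budget_gap_old = c_old - savings * (1 + r_next)
# Ratio of marginal utilities, (alpha / c_y) / ((1 - alpha) / c_o).
mrs = (alpha / c_young) / ((1 - alpha) / c_old)

print(c_young, c_old, savings, labor)
print(budget_gap_young, budget_gap_old, mrs)
```

Note that savings depend only on the wage and on $\alpha$, not on the interest rate, which is a special feature of the logarithmic utility function.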
We know from the growth theory chapter that in competitive theory the wage will turn out to
equal the marginal product of labor. A competitive firm will in theory choose to hire additional
labor input up to the point where the marginal product of the last worker equals the wage. But
this marginal product is in turn determined by the production function and the supplies of total
capital and labor. The connection between wages and factor inputs is listed below. We also list the
relation between the rental rate of capital $R_t$, the real interest rate $r_t$ and marginal products. These
follow logic similar to that discussed in the growth theory chapter.

$$w_t = F_L(K_t, N) = (1-\beta) A K_t^{\beta} N^{-\beta} = (1-\beta) A k_t^{\beta}$$

$$R_t = F_K(K_t, N) = \beta A K_t^{\beta-1} N^{1-\beta} = \beta A k_t^{\beta-1}$$

$$r_t = F_K(K_t, N) - \delta$$
The three equations above hold at any time given the available quantities of capital $K_t$ and labor
$N$. Labor supply $L_t$ equals $N$ as each of the $N$ young agents works one unit of time. The capital
level $K_t$ was determined by the savings decisions of young agents alive in period $t-1$.
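
The three pricing equations can be evaluated directly from the capital-labor ratio. The sketch below does so and checks the accounting identity that factor payments exhaust output per worker, $w_t + R_t k_t = y_t$. The function name is illustrative, the parameter values are the ones used in the numerical example of Table 5.1, and the capital-labor ratio $k = 2.5$ is an arbitrary choice.

```python
def factor_prices(k, A, beta, delta):
    """Wage, rental rate and real interest rate at capital-labor ratio k."""
    wage = (1 - beta) * A * k ** beta        # w_t = F_L
    rental = beta * A * k ** (beta - 1)      # R_t = F_K
    interest = rental - delta                # r_t = F_K - delta
    return wage, rental, interest


# Parameter values from the numerical example in Table 5.1; k is arbitrary.
A, delta, beta = 10.0, 0.1, 0.5
k = 2.5

wage, rental, interest = factor_prices(k, A, beta, delta)
output_per_worker = A * k ** beta

# Constant returns to scale: labor income plus capital income equals output.
income_gap = wage + rental * k - output_per_worker

print(round(wage, 3), round(rental, 3), round(interest, 3))
print(abs(income_gap) < 1e-9)
```

The interest rate looks large by annual standards, but a model period corresponds to several decades.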
One thing that is missing from this account is an understanding of how the aggregate capital
stock changes over time. What determines how the aggregate capital stock moves over time? The
answer is that it all depends on savings behavior. The only agents that are going to hold savings
between periods are young agents. The reason is simple: the young care about consumption when
old but have no source of labor earnings when old. Thus, holding assets (physical capital) is the
means to consume in old age in the benchmark model without a government. The old agents will
not hold assets for an additional period as they will be dead next period and they get no joy from
bequeathing assets to their children or to anyone else. Yes, the model is very stylized!
The first equation below describes how the capital stock evolves over time. Total capital holding
from period $t$ to period $t+1$ is $N$ times the amount $a_{t+1} = (1-\alpha) w_t$ of savings of each young
agent. This comes from dynamic consumer theory. Since the wage at time $t$ is the marginal product
of labor, the first equation substitutes this in for the wage. The second equation follows from the
first by dividing each side of the first equation by labor. This is a standard and useful transformation
that was widely used in analyzing the Solow model.

$$K_{t+1} = N a_{t+1} = N(1-\alpha) w_t = N(1-\alpha)(1-\beta) A k_t^{\beta}$$

$$k_{t+1} = a_{t+1} = (1-\alpha) w_t = (1-\alpha)(1-\beta) A k_t^{\beta}$$

Figure 5.1: Law of Motion for the Life-Cycle Model

The equation $k_{t+1} = a_{t+1} = (1-\alpha) w_t = (1-\alpha)(1-\beta) A k_t^{\beta}$ is very important in our analysis.
We will call this equation the law of motion for capital. Just as in the Solow model, once one knows
how the capital-labor ratio moves over time then one can easily figure out how all the other model
variables move over time. This is because the output-labor ratio, investment-labor ratio and factor
prices are all simple functions of how the capital-labor ratio moves over time.
Figure 5.1 graphs the law of motion. The current capital-labor ratio $k_t$ is on the horizontal axis
and the future value $k_{t+1}$ is on the vertical axis. The places where the graph crosses the 45 degree
line are steady states. Steady states are capital-labor ratios which do not change over time. Thus,
when the economy is in steady state no variable will change over time unless the economy is hit by
an exogenous shock.
Figure 5.1 shows that there is a unique positive capital steady state and a trivial zero capital
steady state. The graph also implies that the economy converges to the positive capital steady state
from any positive initial value of the capital-labor ratio. This occurs as the graph lies above the 45
degree line to the left of this steady state and lies below it to the right of this steady state.
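
The convergence property can be illustrated numerically. Setting $k_{t+1} = k_t = k^*$ in the law of motion gives the positive steady state $k^* = [(1-\alpha)(1-\beta)A]^{1/(1-\beta)}$. The sketch below compares this closed form with paths started below and above it. The function names and the two starting values are illustrative assumptions, and the parameter values are those of the numerical example in Table 5.1.

```python
def next_k(k, A, alpha, beta):
    """Law of motion: k_{t+1} = (1 - alpha)(1 - beta) A k_t**beta."""
    return (1 - alpha) * (1 - beta) * A * k ** beta


def steady_state(A, alpha, beta):
    """Positive solution of k = (1 - alpha)(1 - beta) A k**beta."""
    return ((1 - alpha) * (1 - beta) * A) ** (1 / (1 - beta))


def simulate(k0, periods, A, alpha, beta):
    """Return the path [k_0, k_1, ..., k_periods]."""
    path = [k0]
    for _ in range(periods):
        path.append(next_k(path[-1], A, alpha, beta))
    return path


A, alpha, beta = 10.0, 0.5, 0.5
k_star = steady_state(A, alpha, beta)

from_below = simulate(1.0, 20, A, alpha, beta)    # starts to the left of k*
from_above = simulate(12.0, 20, A, alpha, beta)   # starts to the right of k*

print(k_star)
print(round(from_below[-1], 4), round(from_above[-1], 4))
```

Both paths approach the same steady state, one rising and one falling, which matches the description of the graph lying above the 45 degree line to the left of the steady state and below it to the right.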

Table 5.1: Time Paths in the Life-Cycle Model

Capital             Output                Investment                         Consumption
$k_t$               $y_t = A k_t^\beta$   $i_t = k_{t+1} - k_t(1-\delta)$    $c_t = y_t - i_t$
$k_0 = 1.0$         $y_0 = 10.0$          $i_0 = 1.6$                        $c_0 = 8.4$
$k_1 = 2.5$         $y_1 = 15.81$         $i_1 = 1.7$                        $c_1 = 14.11$
$k_2 = 3.95$        $y_2 = 19.88$         $i_2 = 1.42$                       $c_2 = 18.46$
···                 ···                   ···                                ···
$k_\infty = 6.25$   $y_\infty = 25.0$     $i_\infty = 0.625$                 $c_\infty = 24.375$
[NOTE: Technology $(A, \delta, \beta) = (10.0, 0.1, 0.5)$ and Preferences $\alpha = 0.5$]

It is useful to consider a numerical example to illustrate how this model works. We use the
law of motion graph which simply plots the equation $k_{t+1} = (1-\alpha)(1-\beta) A k_t^{\beta}$. This equation
allows one to calculate how the capital-labor ratio moves over time. The key inputs are the model
parameters $(A, \alpha, \delta, \beta)$ and the initial capital-labor ratio $k_0 = 1.0$. The numerical example sets the
technology parameters to $(A, \delta, \beta) = (10.0, 0.1, 0.5)$ and the preference parameter to $\alpha = 0.5$. In
Table 5.1 we plug in the value $k_0 = 1.0$ into the law of motion and find that $k_1 = 2.5$. We repeat this
procedure to produce the sequence of capital-labor ratios in Table 5.1. Once one has calculated how
the capital-labor ratio moves over time, calculating all other variables of interest is straightforward.
This is because all other variables are simple functions of the capital-labor ratio.
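
The procedure described above is easy to automate. The sketch below iterates the law of motion from $k_0 = 1.0$ and computes output, investment and consumption per worker using the formulas in the header of Table 5.1. The function name is illustrative. Because the code carries unrounded values from one period to the next, the last digit of some entries may differ slightly from a table built from rounded intermediate values.

```python
def time_paths(k0, periods, A, alpha, beta, delta):
    """Rows (t, k_t, y_t, i_t, c_t) generated by the law of motion for capital."""
    rows = []
    k = k0
    for t in range(periods):
        k_next = (1 - alpha) * (1 - beta) * A * k ** beta  # law of motion
        y = A * k ** beta                                  # output per worker
        i = k_next - k * (1 - delta)                       # investment per worker
        c = y - i                                          # consumption per worker
        rows.append((t, k, y, i, c))
        k = k_next
    return rows


# Parameter values from the note under Table 5.1.
A, delta, beta, alpha = 10.0, 0.1, 0.5, 0.5

rows = time_paths(1.0, 3, A, alpha, beta, delta)
for t, k, y, i, c in rows:
    print(f"t={t}  k={k:.2f}  y={y:.2f}  i={i:.2f}  c={c:.2f}")
```

Running more periods shows the rows approaching the steady-state values in the last line of the table.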

5.3 Analyzing a One-Time Shock


Starting around 1990 there was a large immigration of Soviet Jews into Israel.1 At this time the
Soviet Union was willing to permit these Jews to leave and Israel felt obligated to accept them.
At the start of this inflow, the immigration was anticipated to be a fairly large increase in the
population of Israel. In 1990 the immigration from this source represented a population increase of
approximately 3.4 percent. It was anticipated that the population increase from this source might
lead to a 20 percent increase in the overall population in a small number of years.
We could analyze the consequences of such an increase in population within the life-cycle
model. Our analysis will be stylized and will focus on qualitative insights. To do so, we will
make three stark assumptions. First, we assume within the model that there is a one-time increase
in the population that is equally divided between old and young agents. Thus, at time t = 0 the
population increases and in all future periods the population of old and young agents remains at the
permanently higher levels. Second, we assume that the model economy was in steady state before
the one-time population increase. Third, we assume that the technology does not change over time.
Thus, the shock is to factor inputs but not to technology.
What does the model predict will be the consequences of this one-time change in the population?
To analyze the model it is best to start with the law of motion for the capital-labor ratio. We note
that this law of motion (the graph of the relationship between $k_t$ and $k_{t+1}$) is determined by
technology parameters and by preferences and by nothing else. Within the model it is true that
neither technology parameters (e.g. $\beta$ and $A$) nor preference parameters (e.g. $\alpha$) change. Thus, the
law of motion graph does not move as a result of the one-time change in the population.

1 Read the Wikipedia entry for "1990s Post-Soviet aliyah".

Although the law of motion does not change, what does change is the capital-labor ratio. The
one-time increase in the population decreases the capital-labor ratio simply because the denominator
(labor input) grows while the capital stock at least initially stays constant. After this one-time
change occurs, the law of motion tells us that the economy returns over time to the steady-state
level $k^*$ of the capital-labor ratio.
Model Predictions:
1. Capital-labor ratio: The capital-labor ratio falls at the time of the shock but afterwards
increases monotonically over time to return to the steady-state level.
2. Output-labor ratio: The output-labor ratio falls at the time of the shock but increases
monotonically over time to the steady-state level afterwards. This occurs because in any time
period $y_t = F(k_t, 1) = A k_t^{\beta}$ and because the production function is increasing in $k_t$. Even though
the output-labor ratio falls, total output increases over time in the model. In fact, the model
implies that over time output must increase by exactly the percentage increase in the labor
input.
3. Wage: The wage $w_t$ falls at the time of the shock but increases in each period after the shock.
This occurs because in any time period $w_t = F_L(k_t, 1) = (1-\beta) A k_t^{\beta}$ and because the marginal product of
labor is (by the Cobb-Douglas assumption) increasing in the capital-labor ratio.
4. Investment: The model predicts a boom in total investment. We can see this directly from the
behavior of the capital-labor ratio. This ratio initially falls entirely because the denominator
(labor) increases. The denominator then stays fixed. Thus, the only way for the capital-labor
ratio to return to steady state is for the total capital stock to increase.
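
These four predictions can be traced out in a short simulation. The sketch below starts the economy in steady state, raises the number of workers by 20 percent while holding the capital stock fixed at the time of the shock, and then follows the law of motion. The parameter values are those of the numerical example in Table 5.1. The 20 percent figure, the normalization of the pre-shock workforce to one and the variable names are assumptions made for illustration.

```python
# Parameter values from the Table 5.1 example; the shock size is an assumption.
A, delta, beta, alpha = 10.0, 0.1, 0.5, 0.5
N_before, N_after = 1.0, 1.2   # workers before and after the one-time increase

# Pre-shock steady state.
k_star = ((1 - alpha) * (1 - beta) * A) ** (1 / (1 - beta))
wage_before = (1 - beta) * A * k_star ** beta
Y_before = N_before * A * k_star ** beta
I_before = N_before * delta * k_star   # steady-state investment replaces depreciation

# At the shock the capital stock is unchanged but labor is higher.
k = k_star * N_before / N_after

path = []
for t in range(8):
    k_next = (1 - alpha) * (1 - beta) * A * k ** beta
    wage = (1 - beta) * A * k ** beta
    Y = N_after * A * k ** beta                        # total output
    I = N_after * (k_next - (1 - delta) * k)           # total investment
    path.append((t, k, wage, Y, I))
    k = k_next

for t, k_t, w_t, Y_t, I_t in path:
    print(f"t={t}  k={k_t:.3f}  w={w_t:.3f}  Y={Y_t:.3f}  I={I_t:.3f}")
```

In this sketch the capital-labor ratio and the wage drop at the time of the shock and then rise back toward their steady-state values, total output rises toward a level 20 percent above its pre-shock value, and total investment at the time of the shock is well above its pre-shock level.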
What Actually Happened to Israel?
There is evidence that around the time of the population change there was (1) a strong increase
in GDP growth rates and (2) an investment boom. Both of these are predicted by the life-cycle
model.

5.4 Can Model Allocations Be Improved?


Can life be improved for the people who live in the life-cycle model over the outcome produced by
the market? To address this issue, it will be important to develop a notion of when one allocation is
better than (a welfare improvement over) another allocation. A standard way that economists think
about welfare improvements is by using the Pareto criterion.2
According to the Pareto criterion, a feasible allocation is Pareto efficient if there is no alternative
feasible allocation that makes at least one individual strictly better off in terms of utility and no
one else worse off. A feasible allocation is not Pareto efficient if there is a Pareto improvement
- a feasible allocation that improves utility for at least one person without lowering utility for
any others. In the life-cycle model a feasible allocation amounts to a situation in which the total
consumption plus investment at any point in time is no more than the output produced with capital
and labor inputs that are available at that time.
In the context of the life-cycle model there are many agents. Remember that agents are born
at each date. Thus, there are infinitely many agents whose utility we must consider in deciding
whether life can be improved in the Pareto sense. Each of these agents is assumed to care only
about their own consumption profile so that only consumption directly determines utility.
The key result on welfare is the Proposition below.3 The Proposition deals with model
2 See
Wikipedia for a brief description of Vilfredo Pareto’ scientific contributions.
3 This
proposition is a version of the “first welfare theorem” which asserts that under appropriate conditions any
allocation produced by perfectly competitive markets is Pareto efficient. One can find a discussion of the first welfare
5.4 Can Model Allocations Be Improved? 73

economies that are more general than the life-cycle model laid out earlier.4 It says that as long as
the real interest rate period by period is always positive, then the allocation produced by competitive
markets within the model is Pareto efficient. Thus, if this positive interest rate condition holds,
then no Pareto improvements can be made. This holds even if we add an all-powerful being to the
model. Specifically, if this all-powerful being can choose an alternative allocation, then it cannot
improve welfare in the Pareto sense while using the production technology that is available within
the model.

A1: The utility function U(cyt , cot+1 ) is increasing in both components and has a well-defined
marginal rate of substitution.
A2: The production function Ft (Kt , Lt ) is constant returns to scale for any time period t.
Furthermore, the production function together with the depreciation rate δ imply that the capital
stock and output must remain bounded.

PROPOSITION: Consider any version of the life-cycle model that satisfies assumptions A1
and A2.
If the allocation produced by such a life-cycle model with competitive markets has the property
that 1 + rt > 1 + ε for all time periods t ≥ 1 for some positive number ε > 0, then the allocation
produced by this model is Pareto efficient.

Heuristic Argument for the Proposition5


Step 1: (Method of Proof) We argue that a Pareto improvement over the allocation produced
by the model is impossible. Specifically, we will first assume that there is a “new allocation”
that produces a Pareto improvement. We then argue that this implies that the purported Pareto
improvement is not feasible.
Step 2: (Improve Utility for the Old) Suppose that the first deviation from the allocation
produced by the model occurs at time t. If the new allocation is a Pareto improvement, then it
cannot take consumption away from the old agents at time t because this will clearly make them
worse off. The remaining possibility is that the new allocation gives more to old agents at time t.
Suppose the new allocation gives these old agents a small amount ∆ > 0 of extra consumption.
Step 3: (Snowball Effect) The young at time t cannot be made worse off if the new allocation is
a Pareto improvement. Thus, if each young agent gives up ∆ > 0, then they need to be compensated
by extra consumption when they are old. The compensation needed is their marginal rate of
substitution times ∆ or |MRS(cyt , cot+1 )| × ∆. This is also equal to one plus the interest rate times ∆
since the marginal rate of substitution must equal one plus the interest rate for optimizing agents.

|MRS(cyt , cot+1 )| × ∆ = (1 + rt+1 ) × ∆ > ∆

Step 4: (Snowball Grows Without Bound) One period in the future (period t + 1) it must be the
case that the (future) young give up consumption to finance the snowball effect in Step 3. If the
young agents at t + 1 are to be made no worse off, then they must in turn be compensated when old.
They give up an amount equal to $(1 + r_{t+1}) \times \Delta$ when young so they must (following the logic of

theorem in most intermediate microeconomics textbooks and in Wikipedia. These treatments typically do not emphasize
the time dimension at all or do not allow for an infinite time horizon that is a feature of the life-cycle model. Economists
have understood that versions of the first welfare theorem apply to situations with both time and economic uncertainty at
least since Gerard Debreu’s “Theory of Value” published in 1959 by Yale University Press.
4 They are more general in that they allow a larger class of utility functions U and production functions F than are

used in the benchmark life-cycle model. The benchmark model uses Cobb-Douglas production and utility functions.
5 A heuristic argument is one that is suggestive but not definitive in settling an issue. An argument that proves the

Proposition can be made, but it will be less transparent for many readers than the heuristic argument.

Step 3) be compensated by their marginal rate of substitution times this amount. This is given in
the equation below.

$$|MRS(c^y_{t+1}, c^o_{t+2})| \times (1 + r_{t+1}) \times \Delta = (1 + r_{t+2})(1 + r_{t+1}) \times \Delta$$

Repeating this argument, the required compensation grows with each generation. Because one plus
the real interest rate exceeds 1 + ε in every period, the compensation owed n generations after
time t is at least $(1 + \varepsilon)^n \times \Delta$.
Step 5: (New Allocation Is Not Feasible) The upshot of Step 4 is that the snowball effect
implies that eventually the transfer of resources from young to old will be arbitrarily large (i.e.
larger than any fixed number). This is infeasible as it was assumed that the production technology
implies that capital and output must remain bounded. Therefore, it is not feasible to make a small
gift of ∆ > 0 to the old agents without decreasing the utility of any other agents. This ends the
heuristic argument.
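
The snowball logic of Steps 3 to 5 can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `required_transfers`, the gift of 0.01 and the constant interest rate of 2 percent per period are hypothetical choices made for the example, not values taken from the model.

```python
def required_transfers(delta, interest_rates):
    """Transfers that successive young generations must give up.

    delta: initial gift to the old agents at time t.
    interest_rates: list [r_{t+1}, r_{t+2}, ...] of real interest rates.
    Returns [delta, (1 + r_{t+1}) * delta, (1 + r_{t+2})(1 + r_{t+1}) * delta, ...].
    """
    transfers = [delta]
    for r in interest_rates:
        # Each generation must be repaid (1 + r) times what it gave up.
        transfers.append(transfers[-1] * (1.0 + r))
    return transfers


# Hypothetical example: interest rate bounded below by eps = 0.02 in every period.
eps = 0.02
path = required_transfers(delta=0.01, interest_rates=[eps] * 500)
print(path[1], path[100], path[500])
```

In this example the required transfer eventually exceeds any fixed bound on output, which is the infeasibility that Step 5 points to.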

Some Comments Related to the Proposition:


1. The Proposition has an interest rate condition. This condition is related to the Golden Rule
literature. It comes close to saying that the model economy is below the Golden Rule
capital-labor ratio. Because of this, one does not have a simple argument that welfare can
be improved in the model economy by consuming more in aggregate at some date without
having to consume less at any other date. Recall that some evidence was provided in Chapter
3 of the book that argued that the U.S. economy appears to be well below the Golden Rule
capital-labor ratio.
2. One should read the Proposition with some caution. For example, certainly it does not
establish that actual economies with positive real interest rates produce Pareto efficient
allocations. The model is, after all, highly stylized. The Proposition does imply that within
the model any government policy that taxes agents and/or firms to make transfers to agents
will not be able to achieve a Pareto improvement.
3. If one is going to interpret business-cycle facts within such a model, then on the basis of this
Proposition one might conjecture that such a framework will not offer easy arguments for
why government tax-transfer programs can improve welfare. We will come back to this issue
when we discuss business-cycle fluctuations and when we discuss fiscal policy issues in later
chapters. However, it is fair to say that if one wanted a theoretical model that allowed for
government tax-transfer policies to achieve Pareto improvements, then one would need a
model with some features beyond those considered so far. The only problem articulated so
far concerning the functioning of markets within the life-cycle model is that it is theoretically
possible that too much physical capital may be accumulated.
4. This Proposition does not imply a vast role for government. There is an old view that
government’s role in the economy should be somewhat limited. For example, Adam Smith
argued in the Wealth of Nations that the three “duties of the sovereign” were (i) (defense)
“protecting the society from the violence and invasion of other societies”, (ii) (justice)
“protecting, as far as possible, every member of society from the injustice or oppression
of every other member of it” and (iii) (public works) “erecting and maintaining those
public institutions and those public works, which, though they may be in the highest degree
advantageous to a great society, are, however, of such a nature, that the profit could never
repay the expense”.6

6 See Adam Smith (1776, Ch 1 - Book V) The Wealth of Nations.



5.5 Review of Marginal Conditions


It is useful at this stage to review how at an abstract level interest rates are related, within the model,
to other (observable) variables. While this was implicit in the previous analysis of the model, now
we highlight these connections.
The life-cycle model implies two types of restrictions on interest rates. From consumer theory,
the slope of the budget line must equal an individual’s marginal rate of substitution between goods.
This holds when the consumer is making a best decision. When one applies this to choice between
the consumption good in period t and period t + 1, then this says that one plus the real interest
rate must equal this intertemporal marginal rate of substitution in absolute value. This marginal
rate of substitution is related, using the utility function $u(c^y, c^o) = \alpha \log(c^y) + (1 - \alpha) \log(c^o)$, to
consumption growth rates. With this utility function, a high interest rate must coincide with a high
growth rate of consumption between youth and old age for the cohort that is young in period t.

$$1 + r_{t+1} = -MRS(c^y_t, c^o_{t+1}) = \frac{u_1(c^y_t, c^o_{t+1})}{u_2(c^y_t, c^o_{t+1})} = \frac{\alpha c^o_{t+1}}{(1 - \alpha) c^y_t}$$

$$1 + r_{t+1} = 1 + F_K(K_{t+1}, L_{t+1}) - \delta$$


The second restriction on the interest rate comes from a simple arbitrage argument. An
individual living in this economy can convert goods in period t into goods in period t + 1 in two
ways. An individual in the model can save in terms of a financial asset and convert 1 time t good
into $1 + r_{t+1}$ time t + 1 goods. An individual can also do this conversion in another way. The
individual can convert 1 unit of time t goods into 1 unit of capital. One period in the future the
individual can rent out the 1 unit of capital and receive $F_K(K_{t+1}, L_{t+1})$ units of consumption goods.
The individual can then (after production) sell the remaining capital for 1 − δ units of the time t + 1
consumption good. Remember that capital depreciates at rate δ across periods. In total the 1 unit of
time t consumption goods has then been converted into $1 + F_K(K_{t+1}, L_{t+1}) - \delta$ units of the time
t + 1 good.7
Arbitrage will in theory force $1 + r_{t+1} = 1 + F_K(K_{t+1}, L_{t+1}) - \delta$. If
$1 + r_{t+1} < 1 + F_K(K_{t+1}, L_{t+1}) - \delta$, then an individual can make unbounded profits on a
risk-free basis by borrowing at rate $r_{t+1}$ and purchasing physical capital, which delivers a larger
(risk-free) return. If $1 + r_{t+1} > 1 + F_K(K_{t+1}, L_{t+1}) - \delta$, then no one will want to hold physical
capital as financial returns are greater than physical capital returns.
We now see that the life-cycle model says that if the real interest rate is high, then two things
must occur. First, consumption growth of young agents needs to be high. This is the first restriction
on the interest rate discussed above. Second, the marginal product of capital needs to be high.
This is the second restriction discussed above. We know from growth theory that this marginal
product can be high when the technology level is (temporarily) high and/or when the capital-labor
ratio is relatively low because $F_K(K_t, L_t) = \beta A K_t^{\beta - 1} L_t^{1 - \beta} = \beta A (K_t / L_t)^{\beta - 1}$ for a Cobb-Douglas
production function.
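
The two restrictions on the interest rate can be checked numerically. The sketch below assumes the Cobb-Douglas production function and the log utility function used in this section. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
def interest_rate_from_capital(A, K, L, beta, delta):
    """Arbitrage condition: r = F_K(K, L) - delta for F = A K^beta L^(1-beta)."""
    marginal_product = beta * A * (K / L) ** (beta - 1.0)
    return marginal_product - delta


def old_age_consumption(c_young, r, alpha):
    """Solve 1 + r = alpha * c_old / ((1 - alpha) * c_young) for c_old."""
    return (1.0 + r) * (1.0 - alpha) * c_young / alpha


# Hypothetical parameter values, for illustration only.
A, L, beta, delta, alpha = 1.0, 1.0, 0.3, 0.1, 0.5
for K in (0.5, 1.0, 2.0):
    r = interest_rate_from_capital(A, K, L, beta, delta)
    c_old = old_age_consumption(1.0, r, alpha)
    # A lower capital-labor ratio gives a higher r and faster consumption growth.
    print(K, r, c_old)
```

A lower capital-labor ratio raises the marginal product of capital, hence the interest rate, and through the first restriction it raises consumption when old relative to consumption when young.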

5.6 Life-Cycle Model: One Downside to the Simple Formulation


In the benchmark life-cycle model, consumers in the model hold all capital and labor and rent these
each period to the firm. This simple formulation implies that the behavior of the firm is simple. The
firm decides separately for each period how much capital and labor to rent to maximize period profit.
7 For simplicity the model assumes that consumption goods can always be converted into capital goods on a one-for-
one basis and capital goods can be converted back into consumption goods on a one-for-one basis. This assumption
effectively means that there are no capital gains or capital losses to be had in holding capital goods over time. Capital
gains are a major source of the variation in stock returns for companies listed on the New York Stock Exchange.

It does not make long-term investment decisions in how many buildings and machines to purchase
and hold between model periods. Instead, consumers make all of these long-term investment
decisions. However, in modern economies firms often own the buildings and machines that they use.
Thus, unlike in the life-cycle model, firms in modern economies make many investment decisions.
The purpose of this section is to outline how the analysis would have to be modified if the
firm were to make investment decisions and what might be gained by doing so.
The objective of the firm would need to change. The new objective would be that the firm
chooses investment and labor over time so as to maximize the discounted value of "dividends" over
the infinite future for its shareholders. The "dividend" of the firm at time t would be the firm’s
output less wages and less investment: $dividend_t = F(K_t, L_t) - w_t L_t - (K_{t+1} - K_t(1 - \delta))$. The
idea of discounting is that a dividend of a given size has a different current value depending on
whether it is paid in the current period or one period in the future, whenever the real interest rate
is positive. Real interest rates determine the discount factors. This notion of discounting is the
same as the notion of present value from the chapter on consumer theory.
Consumers in the economy would buy and sell shares in this firm. The owners of these shares
receive the dividend payments and, thus, these shares have value. The value of these shares would
be a key component of wealth for consumers. In theory the value of the firm’s shares would equal
the value of discounted future dividends. In this way, the value of the firm would correspond to the
notion of the stock market value of firms in modern economies. Formulating the model in this way
would allow an analysis of how shocks to the economy impact output, wages and interest rates (as
before) and how the same shocks impact the stock market value of the firm (not covered before).
Thus, the benefit of modeling firm investment decisions would be a theory of how shocks move the
stock market and the overall economy.
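
The dividend definition and the idea of valuing the firm as discounted dividends can be sketched in code. This is an illustration under assumptions: the production function is taken to be Cobb-Douglas, the infinite sum is truncated at a finite horizon, and the function names and numbers are hypothetical.

```python
def dividend(A, K, L, w, K_next, beta, delta):
    """Output less wages and less investment, with F = A K^beta L^(1-beta)."""
    output = A * K ** beta * L ** (1.0 - beta)
    investment = K_next - K * (1.0 - delta)
    return output - w * L - investment


def present_value(dividends, interest_rates):
    """Discounted value of [d_t, d_{t+1}, ...] given [r_{t+1}, r_{t+2}, ...].

    The dividend paid j periods ahead is divided by the product of
    (1 + r) over the j intervening periods.
    """
    value = 0.0
    discount = 1.0
    for j, d in enumerate(dividends):
        if j > 0:
            discount /= 1.0 + interest_rates[j - 1]
        value += discount * d
    return value


# Hypothetical numbers: constant capital, so investment just replaces depreciation.
d = dividend(A=1.0, K=1.0, L=1.0, w=0.7, K_next=1.0, beta=0.3, delta=0.1)
print(d, present_value([d] * 50, [0.05] * 49))
```

With a longer horizon the truncated sum approaches the value of the infinite dividend stream, provided the interest rate is positive.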

5.7 Key Concepts


A feasible allocation is an allocation that can be produced with the available inputs and
technology.
A feasible allocation is Pareto efficient provided that there is no alternative feasible allocation
such that at least one consumer is better off and no consumer is worse off.
6. Business-Cycle Fluctuations

A sure way to start a fight amongst a group of economists is to start talking with great confidence
about what produces business-cycle fluctuations. Economists largely agree on what are some of
the main statistical features of business-cycle fluctuations. This much is good news. The big
disagreements are over what type of theory might usefully explain the type of fluctuations that
are observed. The theories discussed in this chapter offer different candidates for the fundamental
driving shocks that produce business-cycle fluctuations.
This chapter is organized in five main parts. First, business-cycle facts are presented and
discussed. Second, we outline a class of business-cycle theories that is in conflict
with a key business-cycle fact: procyclical labor productivity. Third, a technology-driven theory
of the business cycle is presented that has a mechanism for producing output fluctuations that
feature procyclical labor productivity. Fourth, the view that business cycles are technology driven
is contrasted with some quotes from John Maynard Keynes.1 According to Keynes, one key driver
of business-cycle fluctuations is the “animal spirits” of investors. Modern authors have tried to
view Keynesian “animal spirits” as rational, self-confirming fluctuations in business confidence.
Fifth, we review the logic of Robert Lucas’ provocative calculation of the maximum potential gain
from perfectly smoothing out the business cycle.

6.1 Business-Cycle Facts


An early attempt to empirically describe the regularities commonly called “business cycles” was
given in the work of Burns and Mitchell (1946) entitled Measuring Business Cycles. They had in
mind the concept of a reference cycle. Roughly, the idea was that one could chop the time series
data of many aggregate economic variables into several time segments. One then tries to determine
if on these time segments there are regular patterns among these aggregate variables even if the
resulting “cycle” is not periodic in the sense of a sine or cosine wave from mathematics.
According to Burns and Mitchell (1946, p. 3), business cycles are
1 Keynes’ influential book was published in 1936 and is modestly titled “The General Theory of Employment, Interest
and Money”.

“... a type of fluctuation found in the aggregate economic activity of nations that organize their
work mainly in business enterprises: a cycle consists of expansions occurring at about the same time
in many economic activities, followed by similarly general recessions, contractions, and revivals
which merge into the expansion phase of the next cycle; this sequence of changes is recurrent but
not periodic; in duration business cycles vary from more than one year to ten or twelve years; they
are not divisible into shorter cycles of similar character with amplitudes approximating their own.”

It is not very common any more to characterize business cycle facts in the manner of Burns
and Mitchell. Instead, there is some agreement that it would be useful to divide a time series of
a variable (e.g. the log of real GDP) into a trend component and a cycle component. Thus, for a
variable $y_t$, one would say $y_t = y_t^{trend} + y_t^{cycle}$. Figure 6.1 carries this out using US data on GDP and
using a particular way of defining the smooth trend line. The cyclical component of log GDP in a
given time period is then the vertical distance in Figure 6.1 between log GDP and the smooth trend
line for log GDP.
To determine the cyclical component of a series, one needs to make a couple of decisions. First,
it is common to take the logarithm of many variables such as GDP, consumption, investment, and
wages. This is done in Figure 6.1 where we graph the log of US GDP. Intuitively, if the components
of GDP are moving around a roughly constant trend growth rate, then taking the log will mean that
these transformed variables are moving around a “nearly” straight trend line. Also taking logs will
mean that deviations from trend can be interpreted as percentage deviations from trend.2
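
The claim that log deviations from trend can be read as percentage deviations can be checked directly. The sketch below uses hypothetical levels (a trend value of 100) and hypothetical function names. It compares the log deviation with the exact proportional deviation for a small and a large gap.

```python
import math


def log_deviation(level, trend_level):
    """Cycle component in logs: log(Y) - log(Y_trend)."""
    return math.log(level) - math.log(trend_level)


def percent_deviation(level, trend_level):
    """Exact proportional deviation: (Y - Y_trend) / Y_trend."""
    return (level - trend_level) / trend_level


# Hypothetical levels: actual values 2 percent and 50 percent above a trend of 100.
for level in (102.0, 150.0):
    print(level, log_deviation(level, 100.0), percent_deviation(level, 100.0))
```

The two measures nearly coincide for the small gap and diverge for the large one, which is why the approximation suits business-cycle deviations of a few percent.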
Second, one needs an operational definition of trend for each series. Figure 6.1 is based on
a standard method for defining a smooth trend line in a business-cycle context.3 It is clear from
Figure 6.1 that the trend line is smoother than the data. Thus, the wiggles in the data that occur at a
frequency of several years (high frequency variation) are included in the cyclical component of
GDP. It should also be intuitively clear that the smooth trend line does change slope, as the average
slopes over different periods of a decade or more are quite different. Thus, the trend component
picks up some low frequency movement in output so that some of this low frequency movement in
output is not present in the cyclical component as defined by this procedure. The low frequency
movements could also be viewed as “slow moving” components.
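
The trend definition attributed to Hodrick and Prescott can be computed by solving the first-order conditions of the minimization problem, which form a linear system. The sketch below is a minimal pure-Python version and an assumption-laden illustration, not the procedure used to produce Figure 6.1: the function name `hp_trend`, the dense Gaussian elimination and the example series are all hypothetical choices.

```python
def hp_trend(y, lam=1600.0):
    """Trend minimizing sum (y - tau)^2 + lam * sum (second differences of tau)^2."""
    T = len(y)
    if T < 3:
        return [float(v) for v in y]  # no second differences to penalize
    # First-order conditions: (I + lam * D'D) tau = y, where row i of D
    # has coefficients (1, -2, 1) in columns i, i+1, i+2.
    M = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    coeffs = (1.0, -2.0, 1.0)
    for i in range(T - 2):
        for a in range(3):
            for b in range(3):
                M[i + a][i + b] += lam * coeffs[a] * coeffs[b]
    # Solve the system by Gaussian elimination with partial pivoting.
    aug = [row + [float(v)] for row, v in zip(M, y)]
    for col in range(T):
        pivot = max(range(col, T), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, T):
            factor = aug[r][col] / aug[col][col]
            if factor != 0.0:
                for c in range(col, T + 1):
                    aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    tau = [0.0] * T
    for i in range(T - 1, -1, -1):
        s = aug[i][T] - sum(aug[i][j] * tau[j] for j in range(i + 1, T))
        tau[i] = s / aug[i][i]
    return tau


# Hypothetical log series: a linear trend plus small alternating wiggles.
series = [0.01 * t + 0.02 * (-1) ** t for t in range(40)]
trend = hp_trend(series)
cycle = [a - b for a, b in zip(series, trend)]
print(max(abs(c) for c in cycle))
```

A dense solver is used only to keep the example self-contained. For long series a banded solver or an existing library routine would be the more practical choice.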
Figure 6.2 graphs the cycle component of output $y^{cycle}$ as well as the cycle components of
consumption and investment. The cycle components of output (GDP) and consumption tend to
move up and down together and have a similar magnitude. The cycle component of investment is
much more volatile than that of output, and it tends to move up and down together with output.
Figure 6.3 graphs the cyclical component of output and the cyclical component of total labor
hours. One can see from the graph that labor hours are about as variable as output and tend to
move up and down together with output. Business cycle facts will simply involve quantitatively
2 A consequence of taking logs is that the cycle component can be viewed as the percentage deviation of the (unlogged)
variable from the trend component. To see this let lower case letters denote the log of the upper case letters so that
$y_t = \log(Y_t)$ and note that $\log(1 + g) \approx g$ when g is small:

$$y_t^{cycle} \equiv y_t - y_t^{trend} = \log(Y_t) - \log(Y_t^{trend}) = \log(Y_t / Y_t^{trend}) = \log(1 + (Y_t - Y_t^{trend}) / Y_t^{trend}) \approx (Y_t - Y_t^{trend}) / Y_t^{trend}$$

3 Given a time series of a variable $(y_1, y_2, ..., y_T)$, the method developed by Hodrick and Prescott involves choosing
the trend terms $(y_1^{trend}, y_2^{trend}, ..., y_T^{trend})$ to minimize the following expression when the smoothing parameter λ is set to
λ = 1600:

$$\sum_{t=1}^{T} (y_t - y_t^{trend})^2 + \lambda \sum_{t=2}^{T-1} [(y_{t+1}^{trend} - y_t^{trend}) - (y_t^{trend} - y_{t-1}^{trend})]^2$$

Intuitively, minimizing the objective function implies that deviations from trend (the first term) should not be too large
and that movements in the trend itself should not be too large (the second term).

Figure 6.1: Defining a Trend

[Figure: US Log GDP 1948-2012: Data and Trend. Horizontal axis: Year. Vertical axis: Log Units. Series: Log GDP, Trend.]

documenting how variable different series are and the degree to which different series tend to
move in the same direction as compared to the cyclical component of GDP. We will use standard
measures used in a course on statistics to state a measure of variability and a measure of how a
series moves in relation to GDP.
Table 6.1 below documents some of the properties of the series displayed in Figure 6.2 and
Figure 6.3. First, we discuss the amplitudes of the cycle components of each series. Table 6.1
documents that consumption of nondurable goods and services is less variable than GDP but that
investment is substantially more variable than GDP. We use the standard deviation from elementary
statistics as a measure of the variability or amplitude of a series. A one standard deviation movement
of GDP from trend is a 1.7 percent movement, whereas the corresponding one standard deviation
movements of consumption of nondurable goods and services and of investment are a 0.9 percent
movement and a 5.0 percent movement, respectively.
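
The two statistics reported in Table 6.1 can be computed as follows. This is a sketch with hypothetical data: the short series below are made-up cyclical components, not US data, and the population form of the standard deviation (dividing by the number of observations) is an assumption about the convention used.

```python
import math


def mean(x):
    return sum(x) / len(x)


def std_dev(x):
    """Population standard deviation, used here as the amplitude measure."""
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def correlation(x, y):
    """Correlation coefficient between two series of equal length."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (std_dev(x) * std_dev(y))


# Hypothetical cyclical components (log deviations from trend), not actual US data.
gdp_cycle = [0.01, 0.02, -0.01, -0.03, 0.00, 0.01]
inv_cycle = [0.03, 0.07, -0.02, -0.09, -0.01, 0.02]
print(std_dev(gdp_cycle), std_dev(inv_cycle), correlation(inv_cycle, gdp_cycle))
```

In this made-up example investment has a larger standard deviation than GDP and a strong positive correlation with it, which is the qualitative pattern the table reports for US data.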

Figure 6.2: Cyclical Components of Output

[Figure: US Business Cycles: Output Components. Horizontal axis: Year. Vertical axis: Business Cycle Component. Series: Consumption, Investment, Output.]

Table 6.1: US Business-Cycle Facts

Variable                   Amplitude (Std. Deviation)   Correlation w/ GDP
GDP                        .017                         1.0
Consumption Nondurables    .009                         0.79
Consumption Durables       .051                         0.65
Investment                 .050                         0.84
Government Spending        .032                         0.05
Labor Hours                .015                         0.87
Employment                 .013                         0.81
Labor Productivity         .008                         0.36

Source: US Bureau of Economic Analysis, 1948-2012 quarterly data

Table 6.1 also examines the cyclical component of total labor hours in the US. Our measure
of total labor hours is approximately as volatile as GDP with a one standard deviation movement
equal to 1.5 percent. This fact is apparent from Figure 6.3. An interesting fact from Table 6.1 is
that the cyclical component of employment (the number of people working) is nearly as variable
as the cyclical component of total labor hours. Note that total labor hours equals the number of
people working times the average hours worked per employed person. Thus, Table 6.1 implies that
the bulk of the cyclical variation in total labor hours comes from people moving into and out of
employment rather than all employed people varying their work hours over the business cycle.
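
Because total hours equal employment times average hours per worker, the log of total hours is the sum of the logs of the two factors. If the trend is computed by a linear procedure, the cyclical components add up in the same way. The sketch below uses this to split the variance of the hours cycle into an employment part and an hours-per-worker part. The decomposition, the function names and the data are illustrative assumptions, not calculations from the text.

```python
def covariance(x, y):
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def employment_share(emp_cycle, hours_per_worker_cycle):
    """Share of the variance of the total-hours cycle attributed to employment.

    With total = employment + hours per worker (in log deviations),
    Var(total) = Cov(total, employment) + Cov(total, hours per worker).
    """
    total = [e + h for e, h in zip(emp_cycle, hours_per_worker_cycle)]
    return covariance(total, emp_cycle) / covariance(total, total)


# Hypothetical log deviations from trend, not actual US data.
emp = [0.010, -0.015, 0.005, -0.020, 0.020]
hours_per_worker = [0.002, -0.003, 0.001, -0.002, 0.002]
print(employment_share(emp, hours_per_worker))
```

A share close to one corresponds to the case described above, where most cyclical variation in total hours comes from movements into and out of employment.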

Figure 6.3: Cyclical Components of Output and Labor Input

[Figure: US Business Cycles: Output, Labor and Productivity. Horizontal axis: Year. Vertical axis: Business Cycle Component. Series: Output, Labor Hours, Labor Productivity.]

Next we discuss the correlation of the cyclical component of each series with the cyclical
component of GDP. It is true that any time series is perfectly correlated with itself. Thus, the
correlation of GDP with itself is 1.0. Both consumption and investment are procyclical in that they
display strong positive comovement with GDP. The correlation is .79 for consumption and .84 for
investment. Government spending has a small positive correlation with GDP equal to .05. This
measure of government spending includes government spending on goods and services (e.g. public
education expenditures of state and local government and federal defense expenditures) but does
not include transfer payments (e.g. social security checks).
On the factor inputs side, labor hours are procyclical in that they have a strong positive correlation
of .87 with GDP. In addition, labor productivity (GDP divided by labor hours) is procyclical with a
correlation with GDP of .36. Both of these last two facts are illustrated in Figure 6.3. Although it is
not clear just now, it soon will be clear that this last finding (i.e. labor productivity is procyclical) is
quite interesting from the perspective of theoretical models that rely on an aggregate production
function. We will discuss this issue in the next section. Figure 6.3 seems to show that the cyclical
component of labor productivity has started to move in the opposite direction from the cyclical
component of GDP in the last few decades.
To sum up, Table 6.1 documents some facts about aggregate fluctuations in US data. A theory
of these fluctuations should say what are the fundamental sources (shocks?) that drive these
fluctuations. A good theory would describe why the proposed fundamental sources lead both hours
and GDP to move together as well as labor productivity and GDP to move together. A good theory
would also describe why the same sources lead investment to fluctuate much more in percentage
terms than GDP and non-durable consumption to fluctuate in percentage terms a bit less than GDP.

One might ask for a lot more but this would be a very good start. Thus, a theory could rely on
some exogenous elements (i.e. unexplained shocks) while having implications for the endogenous
variables (i.e. output, consumption, investment, hours and labor productivity).

6.2 Outlines of an Unsuccessful Theory


At this stage it is useful to describe features of a class of theoretical models of business cycles that
will be inconsistent with observation. While this may seem like an odd strategy, it will prove to be
useful. The reason is that once one clearly lays out a robust and challenging fact for theory, then this
helps to outline some options for constructing theory that is not automatically counterfactual. Such
an approach is also useful as a concrete illustration of how the interplay between theory and data
helps to organize one’s thinking about the plausibility of broad classes of business-cycle theories.
It turns out that it will be useful to focus on labor productivity. Labor productivity is procyclical
in U.S. data as the analysis in Table 6.1 of the previous section documented. Thus, when log GDP
is above trend then log labor productivity, where labor productivity is defined as the ratio of GDP
to labor input, tends to also be above trend.

Figure 6.4: A Mechanism Producing Counter-Cyclical Labor Productivity

Now let us think about this empirical fact within a simple model with an aggregate production
function.4 More specifically, let us think of models in which (i) output is produced by an aggregate
production function $Y_t = A_t F(K_t, L_t)$ and (ii) neither technology $A_t$ nor capital $K_t$ changes over time.
4 The business-cycle fact table from the previous section organized the “facts” already around the concept of an

aggregate production function.



So far we have not made any explicit assumption on whether consumers are rational or irrational
or on what the sources of shocks are that lead to variation in labor input. We have ruled out, by
assumption, a role for technology shocks.
One can now ask whether the data produced by any theoretical model with these ingredients
will display procyclical labor productivity. The answer is no for production functions with
constant returns and diminishing marginal products. Times of high labor input will be times of
both high output and low labor productivity, other things equal. Figure 6.4 graphs two data points
$(L_1, Y_1)$ and $(L_2, Y_2)$ that are consistent with such a production function. Labor productivity is
simply the slope of the line segment connecting the origin to the data point $(L_1, Y_1)$ or the origin
to the data point $(L_2, Y_2)$. Clearly, the slope produced by the latter point is smaller than the slope
produced by the former point in Figure 6.4.
Under the stated assumptions, labor productivity must be low when output is high (e.g. output
level $Y_2$) by the diminishing marginal product of labor. Thus, if the model has fluctuations in output
driven by fluctuations in aggregate labor then it also has to have countercyclical labor productivity.
This logic holds independently of the motivations behind consumer behavior - the consumers
populating such a model could be rational or irrational. It also holds independently of the source of
the variation in labor input.5
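
The argument can be verified numerically for a Cobb-Douglas production function, which has constant returns and diminishing marginal products. The parameter values and function names below are hypothetical. Technology and capital are held fixed while labor input rises.

```python
def output(A, K, L, beta):
    """Cobb-Douglas production: Y = A K^beta L^(1-beta)."""
    return A * K ** beta * L ** (1.0 - beta)


def labor_productivity(A, K, L, beta):
    """Output per unit of labor input: Y / L."""
    return output(A, K, L, beta) / L


# Hypothetical values: A and K fixed, labor rises by 10 percent from L1 to L2.
A, K, beta = 1.0, 1.0, 0.3
L1, L2 = 1.0, 1.1
print(output(A, K, L1, beta), labor_productivity(A, K, L1, beta))
print(output(A, K, L2, beta), labor_productivity(A, K, L2, beta))
```

Output is higher at L2 while output per unit of labor is lower, so labor productivity is countercyclical in this setting.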
Now it is important to ask how one could alter the assumptions on technology so as to allow a
model with an aggregate production function to produce procyclical labor productivity. We consider
two possibilities. Each possibility in Figure 6.5 “explains” the same data, since the two data points are
the same in both panels of Figure 6.5.
Possibility 1 is to allow the technology to change over time. This is displayed on the left-hand
side of Figure 6.5. If labor input and technology move together in that their deviations from trend
are positively correlated, then times of high output will be times when both technology and labor
input are high. It can then be the case that high output is associated with high labor productivity. The
reason is that technology and labor have offsetting effects on labor productivity. More specifically,
higher labor decreases labor productivity other things equal but higher technology increases labor
productivity other things equal.
Possibility 2 is that technology does not change over time but that the production function
exhibits increasing returns in labor input alone. This is displayed on the right-hand side of Figure
6.5. Thus, a single curve passes through both data points. To pass through both points (and the
origin) the marginal product of labor must increase in the labor input. Then one can see that times of
high labor input are times of high labor productivity so that again labor productivity is procyclical.
The problem with this explanation is that it implies a form of increasing returns to scale strongly
at odds with that estimated in a large empirical literature (both at the aggregate level and at lower
levels of aggregation such as sector, industry or firm levels) on this issue.
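
Both possibilities can be illustrated with a short calculation. The sketch below is hypothetical throughout: Possibility 1 uses a Cobb-Douglas function in which technology rises together with labor input, and Possibility 2 uses the made-up functional form Y = L^gamma with gamma greater than one to stand in for increasing returns in labor alone.

```python
def cobb_douglas_productivity(A, K, L, beta):
    """Y / L when Y = A K^beta L^(1-beta), which equals A (K / L)^beta."""
    return A * (K / L) ** beta


def increasing_returns_productivity(L, gamma):
    """Y / L when Y = L^gamma (hypothetical form), which equals L^(gamma - 1)."""
    return L ** (gamma - 1.0)


# Possibility 1: technology and labor both rise by 2 percent, capital fixed.
low = cobb_douglas_productivity(A=1.00, K=1.0, L=1.00, beta=0.3)
high = cobb_douglas_productivity(A=1.02, K=1.0, L=1.02, beta=0.3)
print(low, high)

# Possibility 2: technology fixed, increasing returns in labor (gamma = 1.2).
print(increasing_returns_productivity(1.0, 1.2), increasing_returns_productivity(1.1, 1.2))
```

In both cases labor productivity is higher when labor input is higher, so each mechanism can produce procyclical labor productivity.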
In what follows, we will choose to maintain the assumption that there is an aggregate production
function with standard properties (e.g. constant returns and diminishing marginal products). These
are precisely the key properties assumed and analyzed in standard growth theory. Given this
decision, we have articulated only one way to escape from countercyclical labor productivity. Thus,
we will shortly analyze a theory of the business cycle where technological shocks are the only
source of fluctuations. This follows the line of work for which Finn Kydland and Edward Prescott
received the Nobel prize in 2004.
One might conjecture that many other shocks besides technology shocks could be potentially
5 The argument does make use of an other things equal assumption in that capital is held constant. Intuitively, with a

constant returns production function, if capital and labor both increase by the same percentage between two points in
time, then labor productivity is constant when output is high or low. This implies that what is happening to capital may
be important empirically for procyclical labor productivity. As long as the movements in capital input are smaller in
percentage terms than labor and not perfectly correlated, then the production function will not produce procyclical labor
productivity when the technology is constant returns to scale and technology does not change over time.

Figure 6.5: Two Mechanisms Producing Pro-Cyclical Labor Productivity

relevant (e.g. wars, demographic shocks, changes in government policy, news about the likelihood
of future shocks or future policy changes, changes in uncertainty and animal spirits) for business
cycle fluctuations or, more broadly, for aggregate fluctuations. However, the assumption of an
aggregate production function with constant returns and diminishing marginal products and no
technological change effectively implies that theories built from any combination of these “other”
shocks will produce countercyclical labor productivity if they have their effects only through their
impact on the quantities of factor inputs. Thus, this line of reasoning which employs an aggregate
production function suggests a key role for technology shocks but does not imply that other sources
of shocks are unimportant.
We acknowledge that this line of reasoning relies heavily on the aggregate production function.
It is of course possible to build up theories of the production side of the economy by aggregating the
behavior of many small firms. When all such small firms have the same constant returns production
function and behave competitively, then the theory implies that the economy functions as if there is
one firm with an aggregate production function which is the same function as that of any of the
small firms. When all small firms do not have the same production function and there are frictions
that do not make it easy to reallocate capital and labor across firms, then this gives rise to a richer
menu of possible reasons for why aggregate labor productivity is procyclical. Such theorizing also
suggests that it is micro-level data that is key for analyzing sources of movements in aggregate
productivity. The analysis of such a framework and corresponding micro-level data is well beyond
the scope of this book.

6.3 Technology Shocks and Business Cycles


Consider the life-cycle model from the previous chapter but with changing technology. At this
stage it is useful to discuss two points before diving into a discussion of this model. First, it is
useful to revisit the discussion of growth accounting from the chapter on growth theory. Recall
that a straightforward growth accounting analysis of the growth of aggregate output, capital and
labor in the US economy comes to two conclusions: (1) technology growth is a major source of the
growth in output or output per unit of labor input and (2) technology growth that we back out of
the Solow growth accounting framework is procyclical in annual data in that technology growth
and output growth are positively correlated. Taking the second finding at face value, a
technology-driven model of the business cycle is interesting for two reasons: the technology growth
measure is procyclical in the data, and such a procyclical pattern is one way to understand why labor
productivity can be procyclical in simple theoretical models.
The measure of technology growth coming directly out of a growth accounting exercise is
(under the assumptions of the theory) a combination of true technology growth and the errors in
measuring output and input growth rates. On this basis, some economists question whether true
technology growth is procyclical, or is as variable as the accounting framework suggests, depending
on their view of the likely nature of measurement error.
The second point to mention is that the theoretical model analyzed in this section deals with
only deterministic movements in technology over time. An intuitive view is that some component
of technological change is random in that future technology levels are not perfectly forecasted
based upon current information. While this seems plausible, we choose to analyze the simple and
not very realistic case of deterministic but variable technological change. The reason is purely
technical. If the technology within the model were random, then young agents would want to take
this randomness into account in making savings plans.
To deal with this, one would need to develop a more sophisticated theory of choice under
uncertainty (e.g. expected utility theory). While this is not too difficult and is pursued to a limited
degree later in this chapter, we choose not to develop business-cycle theory with random technology.
It is clear that allowing for such randomness would be important in evaluating the issue of whether
there may be welfare gains (in the Pareto sense) to pursuing policies that attempt to smooth out the
business cycle or that attempt to provide insurance beyond that offered by financial markets for
such randomness.

6.3.1 A Model with Technology Shocks


The model we will consider is the life-cycle model from the previous chapter with one change. We
now let the technology term $A_t$ hitting the production function
$Y_t = A_t F(K_t, L_t) = A_t K_t^{\beta} L_t^{1-\beta}$ move
up and down over time. To connect to the growth theory literature, one could view the technology
level as moving up and down around some positive trend growth path. We continue to assume that
the production function is of the Cobb-Douglas form, that the generational structure is the same as
before with equal numbers of young and old agents in each time period and that consumers have
preferences over lifetime consumption plans U(cyt , cot+1 ) = α log(cyt ) + (1 − α) log(cot+1 ).
The upshot of these assumptions will be that this model can be analyzed using the type of
graphical analysis from the previous chapter. The only difference from the previous chapter is that
the law of motion graph for the capital-labor ratio shifts over time because the technology At moves
over time. Thus, the law of motion follows the equation below. Figure 6.5 is a graph of this law of
motion for two values of the technology variable A.

$$k_{t+1} = (1-\alpha)(1-\beta) A_t k_t^{\beta}$$

To see how the model works, consider a special and very simple case where the technology
undergoes a one-time, permanent increase. At time t = 1 and all future periods the technology level
undergoes a permanent increase to level Ahigh which exceeds the previous level labeled Alow . We
assume that the capital stock at time t = 1 inherited from past decisions is initially at the steady
state level associated with the previous (lower) level of technology Alow . The upshot of these
assumptions is that the law of motion for capital shifts permanently upwards and that because of
this the capital stock increases over time until it converges to the new higher steady state level.

Figure 6.6: Law of Motion: High and Low Technology Levels

It is easy to figure out how the change in technology affects other variables. First, as the capital
stock increases there is an investment boom - recall that $i_t = k_{t+1} - k_t(1-\delta)$. Second, the level of
GDP per unit of labor (labor productivity) increases over time - recall $y_t = A_t k_t^{\beta}$. At the time of
the technology improvement this is entirely due to the technology term $A_t$. However, there is also a
delayed effect on $y_t$ due to the increase in the capital stock $k_t$ induced by the technology change.
Third, there is an increase in the real wage $w_t$ over time - recall $w_t = (1-\beta) A_t k_t^{\beta}$. This increase
in wages occurs initially because of the increase in technology but is reinforced by the induced
increase of the capital-labor ratio $k_t$.
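The adjustment path described above can be traced numerically. The following is a minimal sketch, not part of the original analysis: the parameter values (α = 0.5, β = 0.3, technology rising from 1.0 to 1.2) and the function names are assumptions chosen only for illustration. It starts the capital-labor ratio at the steady state for the low technology level and iterates the law of motion under the high level.

```python
def next_capital(k, A, alpha, beta):
    """One step of the law of motion k' = (1 - alpha) * (1 - beta) * A * k**beta."""
    return (1 - alpha) * (1 - beta) * A * k ** beta


def steady_state(A, alpha, beta):
    """Capital-labor ratio that the law of motion maps into itself when A is constant."""
    return ((1 - alpha) * (1 - beta) * A) ** (1 / (1 - beta))


# Assumed, purely illustrative parameter values (not calibrated, not from the text).
alpha, beta = 0.5, 0.3
A_low, A_high = 1.0, 1.2

# Start at the steady state for A_low, then apply A_high in every period from t = 1 on.
k_path = [steady_state(A_low, alpha, beta)]
for _ in range(30):
    k_path.append(next_capital(k_path[-1], A_high, alpha, beta))

for t, k in enumerate(k_path[:5], start=1):
    y = A_high * k ** beta               # output per unit of labor
    w = (1 - beta) * A_high * k ** beta  # real wage
    print(f"t={t}: k={k:.4f}  y={y:.4f}  w={w:.4f}")

print(f"new steady state k = {steady_state(A_high, alpha, beta):.4f}")
```

Under these assumed values the printed path shows output and wages jumping at t = 1 through the technology term alone and then continuing to rise as capital accumulates toward the new steady state.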
One notable problem with this simple model as a model of business-cycle fluctuations is that
by assumption labor does not move in response to anything in the model. The young work all
the time regardless of the precise level of the wage rate. This could be addressed by allowing an
alternative use for the time of young agents. One way to do this is to allow agents to split their total
time allocation of one unit between work and leisure. Thus, agents would need to decide whether
to work more or less when wages are high. In the model at present young agents spend all their
time working as leisure by assumption is not valued as it does not enter the utility function. A
more plausible model would let utility depend upon leisure time as well as goods consumption.
This change would give rise to a much richer model that is the focus of the bulk of the business-cycle
literature in the last several decades. This book lets you drive the Ford Model T, as its owner's
manual is easy to understand, while the Ferrari stays in the garage.

6.4 The Keynesian View


John Maynard Keynes (1936) offered quite a different view of the sources of business cycles
and of the nature of consumer behavior from the view analyzed in the previous section.6 The
key intellectual problem was to account for the fall in output occurring in the Great Depression.
According to Keynes, what are the sources of business-cycle fluctuations that can be contrasted
with the technology-driven, business-cycle model sketched earlier? His view could be paraphrased
as the claim that the “animal spirits” of investors are a key source of fluctuations. The key quote
from Keynes is provided below.

“Even apart from the instability due to speculation, there is the instability due to the charac-
teristic of human nature that a large proportion of our positive activities depend on spontaneous
optimism rather than on a mathematical expectation, whether moral or hedonistic or economic.
Most, probably, of our decisions to do something positive, the full consequences of which will be
drawn out over many days to come, can only be taken as a result of animal spirits - of a spontaneous
urge to action rather than inaction, and not as the outcome of a weighted average of quantitative
benefits multiplied by quantitative probabilities.” - Keynes (1936, Chapter 12, p. 161)

It is clear that Keynes did not view his hypothesized animal spirits to be a result of a rational
calculation. Nevertheless, some economists have tried to work out theories of economic fluctuations
based on a rational version of the “animal spirits” hypothesis. A first step in this effort was to
establish the logical conditions under which swings in business confidence are rational and yet
unrelated to changes in fundamentals (e.g. changes in production functions or changes in the
population of the economy) but still affect real economic activity.7
We note that even if economic fluctuations based on this type of mechanism occur in a model
economy, this does not mean that these fluctuations will be consistent with observation. For example,
earlier in this chapter we noted that any type of labor supply behavior, rational or irrational, within
the context of a model with an unchanging, constant returns aggregate production function will
produce countercyclical labor productivity. In U.S. data labor productivity is procyclical. Thus,
simply adding in the logical possibility of animal spirits into the life-cycle model together with
a labor-leisure choice, without other changes to the framework, will not produce business-cycle
fluctuations with procyclical labor productivity.
What is the nature of consumer behavior in response to the fluctuations induced by animal
spirits? According to Keynes, the answer is that consumers can be viewed as applying a rule of
thumb. This view is embodied in the “Keynesian consumption function” that is sometimes taught
in introductory classes.

“The fundamental psychological law, upon which we are entitled to depend with great confidence
both a priori and from our knowledge of human nature and from the detailed facts of
experience, is that men are disposed, as a rule and on the average, to increase their consumption
as their income increases, but not by as much as the increase in their income.” - Keynes (1936,
Chapter 8, p. 96)

6 See Keynes’s (1936) book “The General Theory of Employment, Interest and Money”. The page numbers in the
quotes from Keynes’ book are from the Harcourt Brace Jovanovich, 1964 Edition. This is a very difficult book to read.
Sir John Hicks wrote a 13 page article (see Hicks (1937), "Mr. Keynes and the Classics: A Suggested Interpretation",
Econometrica, 5, 147-59) that tried to reduce the content of Keynes’ book to two equations. A version of Hicks’
interpretation is what is typically presented as the Keynesian IS-LM model in intermediate macroeconomics textbooks.
7 See Cass and Shell (1983), Do Sunspots Matter?, Journal of Political Economy, 91, 193-227.

This view contrasts with the optimizing view of consumer behavior summarized in the chapter
on dynamic consumer theory. The optimizing view is presented in all microeconomic textbooks.
It is the dominant view employed in current theoretical and empirical work in the consumption
and saving literature. Two separate Nobel Prizes were awarded to Milton Friedman in 1976 and to
Franco Modigliani in 1985 for the development of this work on dynamic consumer theory.
According to the optimizing view, the response of a specific household’s current consumption
to an increase in current household income is not determined by some invariant proportion of the
percentage change in current income from the previous period’s income. Instead, the optimizing
view implies that the response of current consumption to a surprise change in current income
would depend on the length of the remaining lifetime, on preferences and on whether the change
in income is expected to be permanent or temporary in nature. Indeed, the theory predicts that
the consumption response to a temporary increase in income should be smaller than the same size
permanent increase. This implication of the theory is largely borne out in surveys of the empirical
work on this issue.8

6.4.1 A Simple Keynesian Model


We will sketch out the simplest textbook Keynesian model, which is sometimes referred to as the
“Keynesian Cross”. It may already be apparent that Keynesian models do not fit neatly
within standard economic theory. Early Keynesian models did not start out by describing utility
functions of agents and production functions of firms and then deduce a mathematical description
of how the aggregate economy behaves based on these and related assumptions. Indeed, Keynes
thought that substantial parts of economics had to be reworked or abandoned to get a theory
useful for thinking about depressions or large recessions. Thus, what is presented next has no
methodological connection with standard economics.
The simple Keynesian model starts with an income accounting identity for a closed economy.
Thus, at a point in time the theory focuses on aggregate consumption C, aggregate investment I,
government spending G and the sum of these terms equals output Y . There is no introduction of a
production function that relates output to inputs. The second equation in the simple Keynesian model
is a behavioral equation (an assumption) describing the relation between aggregate consumption C
and aggregate income Y and aggregate taxes T . The lower case symbols (a, b) in this equation are
parameters (i.e. constants). This equation is typically termed the “consumption function” and it is
consistent, when b is positive and less than 1, with Keynes’ words: “men are disposed, as a rule
and on the average, to increase their consumption as their income increases, but not by as much
as the increase in their income.” The parameter b is typically termed the “marginal propensity to
consume” in Keynesian economics. Equations 1-2 are ALL of the equations of the simple model.

1. $C + I + G = Y$
2. $C = a + b(Y - T)$
3. Implication: $[a + b(Y - T)] + I + G = Y$ or $Y = \frac{a - bT + I + G}{1 - b}$

8 Attanasio and Weber (2010), Consumption and Saving: Models of Intertemporal Allocation and Their Implications
for Public Policy, Journal of Economic Literature, vol. 48, pages 693-751.

At a mathematical level, we will treat government spending G, taxes T and investment I as
exogenous. Thus, these variables are not explained or produced within the model. The goal of the
model is to get a theory of the level of income Y, given the exogenous variables (G, T, I). Simple
algebra then produces that there is one solution Y to these equations, given in the third equation
above. It asserts that if for exogenous reasons investment I decreases then income Y also decreases.
Keynes’ view seems to be that investment is highly variable and is affected by the unexplained
animal spirits of investors.
Some economists and politicians try to apply this framework only when an economy is in a
recession or depression. Thus, Keynesian economics is sometimes termed "Slump Economics". A
typical argument is that there is extra labor that could be employed to increase income and output Y
as unemployment rates are high.
The question is then what tools a government has to increase Y. We provide three answers.
All are based on the following relationship:

$$Y = \frac{a - bT + I + G}{1 - b}$$
A first answer comes from asking what is the impact on output of increased government
spending G without changing taxes T. It is straightforward to calculate the extra output $\Delta Y$ produced
by the model due to additional spending $\Delta G$.9 The multiplier $\Delta Y / \Delta G$ equals $1/(1-b)$. It gives the impact on
income of increasing government spending by 1 unit. It is termed the unbalanced budget multiplier
because the extra spending is not required to be financed, in such a calculation, by extra taxes.

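Unbalanced Budget Multiplier: $\Delta Y = \frac{1}{1-b}\Delta G$ implies $\frac{\Delta Y}{\Delta G} = \frac{1}{1-b} > 0$
Balanced Budget Multiplier: $\Delta Y = \frac{1-b}{1-b}\Delta G$ implies $\frac{\Delta Y}{\Delta G} = \frac{1-b}{1-b} = 1$
Tax Multiplier: $\Delta Y = -\frac{b}{1-b}\Delta T$ implies $\frac{\Delta Y}{\Delta T} = -\frac{b}{1-b} < 0$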

A second answer comes from asking what is the extra income produced by increasing G and
at the same time increasing T by the same amount (i.e. ∆G = ∆T ). The answer is that income
increases exactly by the amount of the increase in spending and taxes as the balanced budget
multiplier is exactly equal to 1. The term balanced budget multiplier is not quite apt as what is
occurring in theory is the requirement that any additional spending be financed by extra taxes and
not that total spending equals total taxes.
A third answer comes from increasing taxes T, holding government spending G fixed. The
answer is that the multiplier is negative because taxes enter negatively in the expression for output
in the Keynesian model. The tax multiplier is $\Delta Y / \Delta T = -b/(1-b)$.
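The three multipliers can be checked numerically by solving the model twice and differencing. The sketch below is only an illustration: the function name `equilibrium_income` and the values of a, b, I, G and T are assumptions, not numbers from the text.

```python
def equilibrium_income(a, b, I, G, T):
    """Solve C + I + G = Y with C = a + b * (Y - T) for income Y."""
    if not 0 < b < 1:
        raise ValueError("the marginal propensity to consume b must lie in (0, 1)")
    return (a - b * T + I + G) / (1 - b)


# Assumed illustrative values; b = 0.75 is the marginal propensity to consume.
a, b, I, G, T = 20.0, 0.75, 50.0, 100.0, 100.0
Y0 = equilibrium_income(a, b, I, G, T)

d = 1.0  # size of the policy change
unbalanced = (equilibrium_income(a, b, I, G + d, T) - Y0) / d    # raise G only
balanced = (equilibrium_income(a, b, I, G + d, T + d) - Y0) / d  # raise G and T together
tax = (equilibrium_income(a, b, I, G, T + d) - Y0) / d           # raise T only

print(f"Y = {Y0:.1f}")
print(f"unbalanced budget multiplier = {unbalanced:.2f}  (formula: {1 / (1 - b):.2f})")
print(f"balanced budget multiplier   = {balanced:.2f}  (formula: 1.00)")
print(f"tax multiplier               = {tax:.2f}  (formula: {-b / (1 - b):.2f})")
```

With the assumed b = 0.75 the differenced values agree with the formulas: 4 for the unbalanced budget multiplier, 1 for the balanced budget multiplier and -3 for the tax multiplier.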
The brief discussion above gives the simple prescription given by Keynesian economists (and
numerous politicians) for what to do when in a deep recession - increase spending, decrease taxes
or carry out a balanced budget increase in spending. While one can find some nuances to this
prescription in the editorial pages of many newspapers during the Great Recession of 2008, it is
fair to say that this is a main thrust of Keynesian economics. In the fiscal policy chapter we will
revisit the issue of “multipliers”. The life-cycle model also has multipliers associated with specific
changes in tax-spending plans or specific changes in tax plans to finance a given spending plan. It
will be useful to contrast the dynamic multipliers in the life-cycle model with the multipliers arising
from Keynesian models.
One can find more complicated Keynesian models in intermediate-level textbooks. Perhaps
the most common model is referred to as the IS-LM model. This model has both a “consumption
function” and an “investment function” rather than solely a consumption function. The investment
function depends on an interest rate r - an endogenous variable. Thus, in the IS-LM model there
are two endogenous variables (Y, r). The style of analysis is the same as in the simple Keynesian
model - one traces out via algebra or graphical methods how changes in exogenous variables impact
(Y, r) and one figures out multipliers associated with changes in policy variables such as taxes or
spending.

9 The symbol ∆ is commonly used in science and mathematics to denote a difference or a change in a variable and
that is the way it is used here. Thus, $\Delta Y = Y^{new} - Y^{old}$ and $\Delta G = G^{new} - G^{old}$. In the chapter on growth theory we used
a similar convention as part of a calculation of growth rates.

6.5 Smoothing Out the Business Cycle


This section moves away from the debate about what shock or collection of shocks are the fun-
damental drivers of business-cycle fluctuations and what mechanisms help to propagate these
fundamental shocks. This section acknowledges that substantial fluctuations in aggregate output
and aggregate consumption do occur about a smooth trend line for these series but does not take a
view on how these fluctuations are produced. The question that is posed is the following: what is
an upper bound to the potential gain to completely smoothing out these fluctuations?10
We make a few key assumptions. First, individuals are assumed to have preferences over risky
consumption profiles that are described by expected utility theory. This theory will be consistent
with the view that getting a sure amount of consumption is strictly better than getting a risky
consumption gamble with the same mean or average consumption level. Second, we will need to
take a stand on the magnitude of the aversion to taking on risk that characterizes actual individuals
or households as well as a stand on the magnitude of fluctuations in aggregate consumption. Third,
it is assumed that the best that policy can do is to completely eliminate the ups and downs of
aggregate consumption but not alter the mean level of consumption at any point in time. This third
assumption is the basis for the view that what one is attempting to calculate is an upper bound to
the gain under the view that policies that fill in the troughs of the business cycle also end up shaving
off the peaks.

6.5.1 Expected Utility Theory


One simple theory of individual attitudes to risk is expected utility theory. This theory has its
origins in theories of gambling behavior. We briefly review the key aspects of this theory.
The mathematical language of the theory is as follows. Denote the payout of a gamble in state
of the world i as either xi or x(i). Each state of the world i = 1, 2, ..., n is assigned a probability
denoted pi or p(i). These probabilities can be a person’s subjective probabilities or can be viewed
as objective probabilities. In either case it is assumed that the probabilities over all the
disjoint events of the world sum to 1. For example, in the case of a coin toss, the probability that
the coin lands heads plus the probability that it lands tails must equal one. This follows standard
probability theory.

$x_i$ - payout in event i
$p_i$ - probability of event i
$u(x_i)$ - utility realized when event i occurs under gamble x
$E[u(x)] \equiv \sum_i u(x_i) p_i = u(x_1)p_1 + u(x_2)p_2 + \cdots + u(x_n)p_n$ - expected utility of gamble x

The theory assumes that individuals rank gambles by computing expected utility. Gamble x
is strictly preferred over gamble x′ precisely when it delivers higher expected utility. To compute
expected utility, the individual is assumed to have a utility function u(xi) describing the (ex-post)
utility associated with any payout xi. Before the agent knows the outcome of the gamble the
agent assigns an expected utility (denoted E[u(x)]) to the gamble. This expected utility is simply
the realized utility averaged using the subjective probabilities. Thus, this calculation follows the
calculation of expected values from standard probability theory as can be found in any book on
statistics or any account of the theory of probability.

10 This question was posed and answered in a provocative book by Robert Lucas (1987) entitled "Models of Business
Cycles", Blackwell, Oxford. Here we follow in a simplified manner the main outlines of his approach to answering this
question. A Wikipedia discussion of this issue can be found by Googling "welfare cost of business cycles". Robert Lucas
received the Nobel Prize in 1995 for his work in macroeconomics.

St Petersburg Paradox
To help understand why economists have adopted this theory, we take a small detour to discuss the
St Petersburg Paradox. This is a famous problem that goes back to the work of Daniel Bernoulli in
the 1700’s.11
Consider a simple gamble that is based on the toss of a fair coin. Let H denote heads and T
denote tails. It is understood that the probabilities of H and T are each one half. This simple gamble
pays off an amount which is equal to 2 to the power of the number of consecutive tails
thrown before the first toss of heads. Thus, this gamble offers a small probability of arbitrarily large
payouts.

Table 6.2: St Petersburg Gamble

x(event)         p(event)
x(H) = 1         p(H) = 1/2
x(TH) = 2        p(TH) = 1/4
x(TTH) = 4       p(TTH) = 1/8
x(TTTH) = 8      p(TTTH) = 1/16
···              ···

If this gamble is offered on a one-time basis, then a common view is that very few individuals
would pay more than 100 dollars to accept this gamble. If a specific person is willing to pay at
most an amount that is strictly less than all the money they have, then this fact would rule
out one straightforward theory of behavior under risk. That theory asserts that
one is willing to accept any gamble with a positive net expected payout. The key feature of the St
Petersburg gamble is that it has a gross expected payout which is infinite and thus the net expected
payout is also infinite no matter how much the individual pays to get this gamble!12 This can be
seen by computing the terms of the expected payout of this gamble as is carried out below. As each
individual term is equal to one half and there are infinitely many terms the gamble therefore has an
infinite expected payout.

E[x] = x(H)p(H) + x(TH)p(TH) + x(TTH)p(TTH) + · · ·

E[x] = 1/2 + 1/2 + 1/2 + · · · = ∞


These observations motivate the question of what type of theory of choice under risk will help
us to understand why a gamble with infinite expected payout would be valued as being worth
so very little to almost every person we know. One theory which will work is to assume that
individuals care about the expected utility of the gamble rather than the expected payout of the
gamble. This can help explain the behavior exactly when the utility of the payout is increasing but
displays diminishing marginal utility. Thus, the utility function bends over so that the marginal
utility of the millionth dollar is less than the marginal utility of the first dollar. Within this theory it
is diminishing marginal utility of consumption that implies that a consumer will always prefer to
purchase an actuarially fair insurance contract over a risk as opposed to paying nothing and facing
the risk.
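The contrast between expected payout and expected utility can be illustrated with a short computation. This is a sketch under stated assumptions: it truncates the gamble after a finite number of tosses, it uses log utility as one example of an increasing utility function with diminishing marginal utility, and it ignores any other wealth the gambler holds. The function names are hypothetical.

```python
import math


def st_petersburg_terms(n_terms):
    """Events with k tails before the first head: payout 2**k, probability (1/2)**(k + 1)."""
    return [(2.0 ** k, 0.5 ** (k + 1)) for k in range(n_terms)]


def expected_value(gamble, u=lambda x: x):
    """Sum of u(payout) * probability over the listed events."""
    return sum(u(x) * p for x, p in gamble)


for n in (10, 20, 40):
    gamble = st_petersburg_terms(n)
    payout = expected_value(gamble)             # grows by 1/2 with every extra term
    utility = expected_value(gamble, math.log)  # settles down as n grows
    print(f"{n:2d} terms: expected payout = {payout:5.1f}, "
          f"expected log utility = {utility:.6f}, "
          f"sure payout with the same utility = {math.exp(utility):.4f}")
```

The truncated expected payout rises without bound as terms are added, while expected log utility converges to log(2). Under this assumed utility function the gamble is therefore worth the same as a sure payout of about 2.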
11 See the Wikipedia entry for the St Petersburg Paradox.
12 The gross expected payout of a gamble is $E[x] = \sum_i x_i p_i$ - payouts times probabilities summed over all the possible
events.
6.5.2 Gain to Eliminating Business Cycles


We now use expected utility theory to evaluate the potential gain to a hypothetical elimination of
aggregate fluctuations. View a typical U.S. business cycle as providing an equal probability chance
of suffering a two percent reduction in consumption or benefiting from a two percent increase
in consumption. Recall from section 6.1 that a one standard deviation movement of aggregate
consumption from a smooth trend line is between 1 and 2 percent in quarterly U.S. data. We
will then ask how much compensation would need to be paid to an agent who experiences this
consumption risk so that his/her utility would be the same as the utility obtained in the world
without risk and with a sure consumption of 100. This calculation is stated below, where λ denotes
the required compensation and x is the consumption gamble that pays off 98 and 102 with equal
probabilities. The compensation is stated in terms of a proportional increase in consumption.

u(100) = E[u(x(1 + λ ))]

To solve this equation for the compensation factor λ we need to take a stand on a useful utility
function u to analyze. Without specifying a utility function we can say that λ is positive provided
that there is diminishing marginal utility of consumption. Below we use the class of utility functions
known as “constant relative risk aversion” utility functions. These are indexed by the parameter
γ. The literature has established that γ measures the aversion to taking on proportional gambles.
Thus, the higher is γ the higher is the compensation required to take on such risk and the greater
the benefit to eliminating such risk. If one graphs this utility function it is clear that any value γ > 0
will be consistent with diminishing marginal utility.

$$u(c) = \frac{c^{1-\gamma}}{1-\gamma} \quad \text{when } \gamma \neq 1$$

$$u(c) = \log(c) \quad \text{when } \gamma = 1$$

Below we substitute this specific function into the equation above.

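$$\frac{100^{1-\gamma}}{1-\gamma} = \frac{(102(1+\lambda))^{1-\gamma}}{1-\gamma}\,(1/2) + \frac{(98(1+\lambda))^{1-\gamma}}{1-\gamma}\,(1/2)$$

$$\Rightarrow (1+\lambda)^{1-\gamma} = \frac{100^{1-\gamma}}{102^{1-\gamma}(1/2) + 98^{1-\gamma}(1/2)}$$

$$\Rightarrow \lambda = \left[\frac{100^{1-\gamma}}{102^{1-\gamma}(1/2) + 98^{1-\gamma}(1/2)}\right]^{1/(1-\gamma)} - 1$$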

Table 6.3: Gain to Eliminating Aggregate Fluctuations

Coefficient of Risk Aversion (γ)     Compensation (λ)
γ = 2                                λ = .00040
γ = 4                                λ = .00080
γ = 10                               λ = .00199

Table 6.3 reports the compensation factor λ for different values of the utility function
parameter γ. The results say that for the range of risk aversion coefficients considered the elimination
of aggregate risk is only equivalent to a very small proportional rise in consumption. This calculation
was first carried out by Robert Lucas in a slightly different way. His main conclusions are similar
to those reported here.13
Lucas took the position that a risk aversion coefficient of more than 10 is not reasonable. The
calculations based on γ = 4 indicate that the maximum gain to perfectly eliminating aggregate
fluctuations is less than a tenth of one percent of consumption. Aggregate consumption per person
in the U.S. economy in 2010 was about 33 thousand dollars (i.e. 10.2 trillion dollars divided by 308
million people). This number times the value of λ for γ = 4 from Table 6.3 gives a free lunch worth
about $26.50 per person every year from eliminating aggregate fluctuations. This number seems very
small.
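The compensation formula is easy to evaluate directly. The sketch below reproduces the λ values for the 98/102 consumption gamble and the per-person dollar figure; the function name `compensation` is hypothetical, and the log-utility branch for γ = 1 is added only for completeness. The last column shows the approximation γσ²/2 with σ² = .0004 mentioned in footnote 13.

```python
import math


def compensation(gamma, low=98.0, high=102.0, sure=100.0):
    """lambda solving u(sure) = 0.5 * u(high * (1 + lambda)) + 0.5 * u(low * (1 + lambda))
    for constant relative risk aversion utility with coefficient gamma."""
    if gamma == 1:
        # log utility: log(sure) = 0.5 * log(high) + 0.5 * log(low) + log(1 + lambda)
        return sure / math.sqrt(high * low) - 1
    e = 1 - gamma
    return (sure ** e / (0.5 * high ** e + 0.5 * low ** e)) ** (1 / e) - 1


for gamma in (2, 4, 10):
    approx = gamma * 0.02 ** 2 / 2  # lognormal approximation, sigma^2 = .0004
    print(f"gamma = {gamma:2d}: lambda = {compensation(gamma):.5f}  (approximation {approx:.5f})")

# Dollar value per person, using the 2010 aggregate figures quoted in the text.
consumption_per_person = 10.2e12 / 308e6
print(f"gain per person at gamma = 4: ${consumption_per_person * compensation(4):.2f}")
```

The two-point calculation and the approximation agree closely for these values of γ, which is why the conclusion does not hinge on the exact distribution assumed for consumption risk.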
One thing that is missing from these calculations is a sense of how averse actual individuals are
to taking on risk. While the literature is full of different methods for using data to get an idea of the
magnitude of risk aversion, we will not discuss this evidence. Rather we will provide a table which
may help you to understand how your own aversion to taking on risk is related to the parameter γ
of the utility function used above.14
We now ask you the following question: what fraction of your wealth are you ready to give up
to escape the risk that you gain or lose a fraction α of your wealth with equal probability? Table 6.4
gives the answer to this question using the class of constant relative risk aversion utility functions
u stated earlier. Thus, you can use Table 6.4 in two ways. First, you can pick the row which best
reflects your attitudes to these two gambles and then see the risk aversion coefficient γ that would
produce these attitudes. Second, given a value of γ you can see how much wealth such a theoretical
agent would be willing to give up not to take these gambles. For example, if you are willing to give
up two percent of your wealth to avoid the α = 10% wealth gamble and sixteen percent of your
wealth to avoid the α = 30% wealth gamble, then your behavior is consistent with γ = 4.

Table 6.4: Fraction of Wealth Paid to Avoid a Wealth Gamble of α (entries in percent of wealth)

Coefficient of Risk Aversion (γ)     α = 10%     α = 30%
γ = 1                                0.5         4.6
γ = 4                                2.0         16.0
γ = 10                               4.4         24.4
γ = 40                               8.4         28.7
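The entries of Table 6.4 can be recomputed from the constant relative risk aversion utility function. The following sketch assumes wealth is normalized to 1 and solves u(1 − π) = (1/2)u(1 + α) + (1/2)u(1 − α) for the fraction π; the function name `fraction_paid` is hypothetical.

```python
import math


def fraction_paid(gamma, alpha):
    """Fraction pi of wealth solving u(1 - pi) = 0.5 * u(1 + alpha) + 0.5 * u(1 - alpha)
    for constant relative risk aversion utility, with wealth normalized to 1."""
    if gamma == 1:
        # log utility: log(1 - pi) = 0.5 * log(1 + alpha) + 0.5 * log(1 - alpha)
        return 1 - math.sqrt((1 + alpha) * (1 - alpha))
    e = 1 - gamma
    return 1 - (0.5 * (1 + alpha) ** e + 0.5 * (1 - alpha) ** e) ** (1 / e)


print("gamma   alpha=10%   alpha=30%")
for gamma in (1, 4, 10, 40):
    row = [100 * fraction_paid(gamma, alpha) for alpha in (0.10, 0.30)]
    print(f"{gamma:5d}   {row[0]:9.1f}   {row[1]:9.1f}")
```

Evaluating the function at other values of γ or α is a quick way to locate the row that best matches your own answer to the question posed above.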

The answer that most people would give for the fraction of wealth given up to avoid these
gambles is consistent with a γ value less than 10. If this characterizes your answer, then there are two
possible conclusions to draw about the welfare gain to eliminating aggregate fluctuations.
One possibility is that some important feature of actual economies is missing from this calculation
so that the range of answers tabulated is not relevant and the question must await further analysis. Along this
line, an important literature in economics notes that the total consumption fluctuations experienced
by individual households are much larger in percentage terms than the observed fluctuations in
aggregate consumption data. Thus, an important debate centers on how any possible smoothing
of aggregate consumption impacts both the level of household consumption and the smoothing of
total consumption risk faced by individual households. The other possibility is that the welfare gain
to perfect business-cycle smoothing is at most a fraction of a percent of consumption each year.
Under this possibility, smoothing out the business cycle does not seem to be a big deal.
13 One difference was that instead of a two point distribution of risk, he analyzed the case where consumption risk x is
lognormally distributed. This leads to a simple approximation formula where the compensation is proportional to risk
aversion γ and to the variance $\sigma^2$ of the log of consumption risk: $\lambda = \gamma \sigma^2 / 2$. This gives results which are close to the
computations in the table when one sets $\sigma^2 = .02^2 = .0004$ so that a one standard deviation shock moves consumption
by two percent.
14 The table comes from Christian Gollier (2001, p.31) “The Economics of Risk and Time”, MIT Press.
6.6 Key Concepts


Business-cycle facts are the patterns among the deviations from trend of many aggregate
economic series.
A series is said to be procyclical if the deviations of this series from trend covary positively
with the deviations of output from trend.
A series is said to be countercyclical if the deviations of this series from trend covary
negatively with the deviations of output from trend.
Labor productivity equals GDP divided by a measure of aggregate labor input.
A balanced-budget government spending multiplier measures the ratio of the change in
output to the change in government spending when government spending and taxes are
increased simultaneously by equal amounts.
An unbalanced-budget government spending multiplier measures the ratio of the change in output
to the change in government spending, holding taxes constant.
7. Fiscal Policy

One of the most important indicators in fiscal policy analysis is the ratio of government debt to
GDP, or the debt-GDP ratio. Figure 7.1 plots the U.S. debt-GDP ratio whereas Figure 7.2 plots the
same ratio for the United Kingdom.1 There are six US episodes in which this ratio rises by at least
20 percent in a small number of years.

Figure 7.1: US Debt to GDP Ratio

The first four of these US episodes are (1) the Civil War, (2) World War I, (3) the Great
Depression and (4) World War II. The fifth US episode started around 1980 and is associated with
the label “Reaganomics”.2 The sixth and last episode is the rise in the ratio which accompanied the
recession that began in 2008, known as the Great Recession. Both figures show that major wars
have in practice been financed largely by issuing more government debt in the US and the UK.

1 The US data come from Henning Bohn, "The Sustainability of Fiscal Policy in the United States" (in: R. Neck and J.
Sturm, "Sustainability of Public Debt", MIT Press 2008, pp.15-49). The data from his work are updated to include 2012
data. The data for the UK come from the Bank of England.

Figure 7.2: UK: Debt-GDP Ratio (vertical axis: debt-GDP ratio in percent; horizontal axis: year, 1700 to 2000)

7.1 Accounting Framework


The behavior of the debt-GDP ratio provokes a number of questions. To get at some of them, it
is useful to develop an accounting framework. The accounting equation below connects the debt
in successive years $t-1$ and $t$ to the deficit $D_t$ in year $t$. It is sometimes useful to talk about a
subcomponent of the deficit called the primary deficit $PD_t = G_t - T_t$. The primary deficit equals
government spending less taxes.

Definition 7.1.1 The Law of motion for debt states that debt at time $t$ equals debt at time $t-1$
plus the deficit at time $t$: $B_t = B_{t-1} + D_t = B_{t-1} + [G_t - T_t + r_t B_{t-1}]$. $B_t$ is real debt at time $t$.
$D_t = G_t - T_t + r_t B_{t-1}$ is the deficit at time $t$. $G_t$, $T_t$ are government spending and net taxes collected at
time $t$ and $r_t$ is the real interest rate at time $t$. The primary deficit is $PD_t = G_t - T_t$.
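
As a rough illustration of Definition 7.1.1, the sketch below iterates the law of motion for debt over a few years. The function name `debt_path` and every number in the example are hypothetical and are chosen only to show the mechanics of the accounting.

```python
def debt_path(b_initial, spending, taxes, rates):
    """Iterate B_t = B_{t-1} + G_t - T_t + r_t * B_{t-1}.

    b_initial plays the role of B_{t-1}. The three lists hold G_t, T_t
    and r_t for successive years. All inputs are illustrative values.
    """
    path = []
    debt = b_initial
    for g, tax, r in zip(spending, taxes, rates):
        primary_deficit = g - tax              # PD_t = G_t - T_t
        deficit = primary_deficit + r * debt   # D_t = PD_t + r_t * B_{t-1}
        debt = debt + deficit                  # B_t = B_{t-1} + D_t
        path.append(debt)
    return path


# Hypothetical example: initial debt of 100, a primary deficit of 5 in
# each of two years and a real interest rate of 2 percent.
example_path = debt_path(100.0, [25.0, 25.0], [20.0, 20.0], [0.02, 0.02])
```

In this example debt grows both because of the primary deficit and because interest accrues on the debt carried over from the previous year.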

When we apply this framework in later parts of this chapter, we will be thinking of a government
that issues default-free debt. This may describe some countries over some time periods but clearly
not all countries. In a number of countries the real market value of government debt can change by
virtue of default or by the anticipation that there has been a change in the probability that a future
default will occur. A debt default occurs when the country announces that it will miss one or many
of the scheduled payments on part or all of its debt. The experience of Argentina over 1999-2002 is
a good example of a country where government debt is viewed as subject to a potential default and
where the market value of government debt experienced dramatic fluctuations. The experiences of
Greece and Ireland in 2010 are also good examples of countries whose debt is viewed as subject
to a potential default. The nominal interest rate implicit in the pricing of Greek government debt
was several percentage points above the comparable interest rate on German government debt. The
analysis in this chapter views government debt as default free. This is a natural first step in building
theory that incorporates government spending, taxation and debt.
2 One account of this episode is presented in The Triumph of Politics: Why the Reagan Revolution Failed, by
David Stockman. Stockman was appointed to be the director of the Office of Management and Budget in the Reagan
administration.
We will now use the accounting equation to decompose how the U.S. debt-GDP ratio fell after
World War II. Step 1 below divides the accounting equation by GDP. Step 2 expresses the debt each
year as a ratio to GDP in that same year. Step 3 puts the change in the debt-GDP ratio on the left
hand side and three separate terms that add up to it on the right hand side.

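$$\frac{B_t}{Y_t} = (1 + r_t)\frac{B_{t-1}}{Y_t} + \frac{PD_t}{Y_t} \tag{7.1}$$
$$\frac{B_t}{Y_t} - \frac{B_{t-1}}{Y_{t-1}} = (1 + r_t)\frac{B_{t-1}}{Y_{t-1}}\frac{Y_{t-1}}{Y_t} - \frac{B_{t-1}}{Y_{t-1}} + \frac{PD_t}{Y_t} \tag{7.2}$$
$$\frac{B_t}{Y_t} - \frac{B_{t-1}}{Y_{t-1}} = \left[\frac{1 + r_t}{1 + g_t} - 1\right]\frac{B_{t-1}}{Y_{t-1}} + \frac{PD_t}{Y_t} \tag{7.3}$$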
Equation 7.3, in which $g_t$ denotes the growth rate of GDP so that $Y_t / Y_{t-1} = 1 + g_t$, says that a
change in the debt-GDP ratio, such as the fall in the debt-GDP ratio after World War II, can be
attributed to two main sources. The first source is a primary surplus, captured by the term
$PD_t / Y_t < 0$. A negative primary deficit is a primary surplus. The second source is
composed of all other terms, $\left[\frac{1 + r_t}{1 + g_t} - 1\right]\frac{B_{t-1}}{Y_{t-1}}$. This term encompasses both the interest rate effect
and the growth effect. It can be negative, leading to a reduction in the debt-output ratio,
when the GDP growth rate exceeds the interest rate.
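
The decomposition in equation 7.3 can be checked with a small numerical sketch. The function name `decompose_debt_ratio_change` and the numbers used (a debt-GDP ratio of 0.80, a real interest rate of 1 percent, GDP growth of 4 percent and a zero primary deficit) are hypothetical and are not taken from the data behind the figures.

```python
def decompose_debt_ratio_change(debt_ratio_prev, r, g, primary_deficit_ratio):
    """Split the change in B/Y into the two terms of equation 7.3.

    debt_ratio_prev is B_{t-1}/Y_{t-1}, r is the real interest rate,
    g is the GDP growth rate and primary_deficit_ratio is PD_t/Y_t.
    """
    interest_growth_term = ((1.0 + r) / (1.0 + g) - 1.0) * debt_ratio_prev
    change = interest_growth_term + primary_deficit_ratio
    return change, interest_growth_term, primary_deficit_ratio


# Hypothetical year: growth exceeds the interest rate and the primary
# deficit is zero, so the whole change comes from the first term.
change, ig_term, pd_term = decompose_debt_ratio_change(
    debt_ratio_prev=0.80, r=0.01, g=0.04, primary_deficit_ratio=0.0
)
```

With these assumed numbers the debt-GDP ratio falls even though the government runs no primary surplus, which is the case the text describes for growth above the interest rate.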

Figure 7.3: Sources of Changes in the US Debt-Output Ratio After 1950 (panel title: Change in US Debt-GDP Ratio; plotted series: Change in Debt-GDP Ratio and Primary Deficit; horizontal axis: year, 1940 to 2020)

Figure 7.3 provides this decomposition. It plots the change in the US debt-GDP ratio in blue.
There is a big decrease in the debt-GDP ratio in the 1950s so these values are negative. Figure 7.3
also plots the year-by-year values for the primary deficit as a ratio to GDP in red. We see that the
blue line is typically below the red line from 1950 to 1980. Thus, there is a remaining source of
the fall in the debt-GDP ratio. Equation 7.3 says that this remaining source is
$\left[\frac{1 + r_t}{1 + g_t} - 1\right]\frac{B_{t-1}}{Y_{t-1}}$ and
Figure 7.3 implies that this term is on average negative from 1950 to 1980 because the blue line is
below the red line. This term can only be negative on average if the real return on US government
debt is below the growth rate of GDP.
Figure 7.3 suggests that for part of the post WWII period the growth rate of GDP exceeded the
average interest rate paid on government debt. This finding is an uncomfortable one for economists
equipped with a simple growth model. The reason is that the interest rate is below the growth rate
in such models exactly when the economy is above the Golden Rule capital-labor ratio. Economists
do not believe that the U.S. economy or any advanced economy is above the Golden Rule and is
suffering from having too much capital as discussed in Chapter 3. Thus, a common view is that a
more sophisticated theory that allows a role for aggregate risk to impact growth rates and returns to
assets is needed to adequately interpret this last fact.

7.2 Present-Value Constraint


We will find it useful to develop a present-value budget constraint based upon the period budget
constraint from the previous section. Within the theory, we will take the view that a government is
infinitely lived. With this assumption, there will be no “last period” in which a government must
settle up debts. So, in this regard, a government is not at all like a finitely-lived consumer.
We now derive the present-value budget constraint. First, we write down the period budget
constraint in period t and t + 1. Here we use the term Rt ≡ 1 + rt for the gross interest rate to allow
a more compact presentation.

$$B_t = B_{t-1} R_t + G_t - T_t \tag{7.4}$$
$$B_{t+1} = B_t R_{t+1} + G_{t+1} - T_{t+1} \tag{7.5}$$

Substitute Bt from 7.4 into the right hand side of 7.5, after dividing 7.5 by Rt+1 . This produces
equation 7.6.
$$\frac{B_{t+1}}{R_{t+1}} = B_{t-1} R_t + G_t - T_t + \frac{G_{t+1} - T_{t+1}}{R_{t+1}} \tag{7.6}$$
This procedure is similar to that used to derive the present value budget constraint used in consumer
theory. Since the government lives indefinitely, one can apply the same procedure to sequentially
express the period constraints corresponding to time t + 2, t + 3, and so on, up to period t + n in
terms of initial debt and the sequence of primary deficits between period t and t + n. At each step,
one takes the present value of both sides of a given future period constraint, and then substitutes
debt from the previous period using the expression for accumulated deficits. For example, for
period t + 7 one takes the period budget constraint $B_{t+7} = B_{t+6} R_{t+7} + G_{t+7} - T_{t+7}$, divides by the
product of the gross interest rates $R_{t+1} R_{t+2} \cdots R_{t+7}$, and substitutes discounted debt
$B_{t+6} / (R_{t+1} \cdots R_{t+6})$ from the period t + 6 analog of equation 7.6. One can see that this operation,
repeated from period t + 1 through period t + n, leads to equation 7.7, below:
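$$\frac{B_{t+n}}{R_{t+1} R_{t+2} \cdots R_{t+n}} = B_{t-1} R_t + G_t - T_t + \frac{G_{t+1} - T_{t+1}}{R_{t+1}} + \frac{G_{t+2} - T_{t+2}}{R_{t+1} R_{t+2}} + \cdots + \frac{G_{t+n} - T_{t+n}}{R_{t+1} R_{t+2} \cdots R_{t+n}} \tag{7.7}$$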
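
The repeated substitution behind equation 7.7 can be verified numerically. The sketch below rolls the period budget constraint forward and compares the discounted terminal debt with the sum of initial debt plus discounted primary deficits. The function name `check_equation_7_7` and all input numbers are hypothetical.

```python
def check_equation_7_7(b_prev, gross_rates, spending, taxes):
    """Return (left side, right side) of equation 7.7 for a finite horizon.

    Index 0 of each list refers to period t and index n to period t + n.
    b_prev is B_{t-1}. All inputs are illustrative values.
    """
    # Roll the period constraint B_s = B_{s-1} R_s + G_s - T_s forward.
    debt = b_prev
    for R, g, tax in zip(gross_rates, spending, taxes):
        debt = debt * R + g - tax

    # discount holds R_{t+1} * ... * R_{t+j}; it equals 1 when j = 0.
    discount = 1.0
    rhs = b_prev * gross_rates[0]
    for j, (g, tax) in enumerate(zip(spending, taxes)):
        if j > 0:
            discount *= gross_rates[j]
        rhs += (g - tax) / discount

    lhs = debt / discount  # B_{t+n} / (R_{t+1} ... R_{t+n})
    return lhs, rhs


# Hypothetical four-period example (n = 3).
lhs, rhs = check_equation_7_7(
    b_prev=50.0,
    gross_rates=[1.03, 1.02, 1.04, 1.01],
    spending=[20.0, 22.0, 21.0, 23.0],
    taxes=[18.0, 25.0, 19.0, 24.0],
)
```

The two sides agree up to rounding error for any inputs, since equation 7.7 is an accounting identity rather than a restriction on policy.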

To obtain the present value budget constraint of the government we need two more steps.
The first step is to assume that the term $\frac{B_{t+n}}{R_{t+1} \cdots R_{t+n}}$ approaches zero as the number n of periods
we look into the future increases. This assumption puts some limitations on how fast debt can
grow. More precisely, it assumes that the debt (eventually) grows at a slower rate over time than
the rate of interest. For example, this assumption precludes the government from “rolling over”
existing debt forever. If the government did roll over its debt forever while holding primary deficits
at zero, then the debt would grow at exactly the rate of interest. The second step is to reorganize
the equation so that government outlays (expenditures and initial debt plus interest) are on the left hand
side and government revenues (taxes) are on the right hand side. The following definition contains
this budget constraint.

Definition 7.2.1 The Present Value Budget Constraint of the Government is as follows:

$$B_{t-1} R_t + G_t + \frac{G_{t+1}}{R_{t+1}} + \frac{G_{t+2}}{R_{t+1} R_{t+2}} + \cdots = T_t + \frac{T_{t+1}}{R_{t+1}} + \frac{T_{t+2}}{R_{t+1} R_{t+2}} + \cdots \tag{7.8}$$

and states that the present value of government outlays equals the present value of government
revenues. The outlays consist of repayment of initial debt with interest and the present value of
government expenditures. The revenues consist of the present value of taxes.

Because of the assumption that the government is infinitely-lived, the present values involved
in the previous definition involve infinite sums.
We can offer two interpretations for what this budget constraint implies. The first interpretation
is that current and future taxes must pay for current debt as well as all current and future spending.
The second interpretation is perhaps more interesting. It is the assertion that a tax cut without
an eventual spending cut is not really a tax cut. To illustrate this interpretation, consider two
hypothetical plans: a status quo plan and a new tax and spending plan. Suppose that the new plan is
viewed as offering a tax cut in that current taxes are lower under the new plan: $T_t^{new} < T_t$. If the new
plan satisfies the government budget constraint and the left-hand side does not change (no change
in spending and no defaulting on the debt), then the right-hand side cannot change. It has to have
the same present value. So a tax cut in period t is not really a tax cut as taxes in some future period
must be raised. This interpretation highlights the point that political pronouncements concerning
proposed tax cuts which are not clearly related to spending cuts may be inconsistent with the present
value budget constraint. Regardless of what interpretations we assign to the present-value budget
constraint, we will view it as a basic restriction on tax, spending and debt plans in that they must
add up so as to satisfy this equation.
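
A small numerical sketch can make the second interpretation concrete. It compares a status quo tax plan with a plan that cuts current taxes and raises taxes one period later by the amount of the cut plus interest. The helper `present_value`, the three-period horizon and all numbers are hypothetical.

```python
def present_value(flows, future_gross_rates):
    """Present value at time t of flows dated t, t + 1, t + 2, ...

    future_gross_rates[j - 1] is R_{t+j}, the gross interest rate used
    to discount from period t + j back to period t + j - 1.
    """
    total, discount = 0.0, 1.0
    for j, flow in enumerate(flows):
        if j > 0:
            discount *= future_gross_rates[j - 1]
        total += flow / discount
    return total


# Hypothetical plans over three periods with a 5 percent interest rate.
future_rates = [1.05, 1.05]
status_quo_taxes = [20.0, 20.0, 20.0]
cut = 4.0
# The new plan lowers taxes today and raises them next period by cut * R_{t+1}.
new_plan_taxes = [20.0 - cut, 20.0 + cut * future_rates[0], 20.0]

pv_status_quo = present_value(status_quo_taxes, future_rates)
pv_new_plan = present_value(new_plan_taxes, future_rates)
```

The two present values coincide, so with spending unchanged the lower current tax is matched by a higher future tax of equal present value.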

7.3 Fiscal Policy in the Life-Cycle Model


In this section we analyze fiscal policy within the life-cycle model. We focus on a special version
of this model. Specifically, we assume that the preference parameter α controlling the preference
for consumption early in life is set to α = 0. This means that consumers will save all available
resources when young to consume in old age as they only care about consumption in old age. This
assumption will allow for a tractable analysis of fiscal policy issues.

$$U(c_y, c_o) = \alpha \log c_y + (1 - \alpha) \log c_o = \log c_o$$

We will now work out the law of motion for the capital-labor ratio. We do this first without
a government and then with a government that can tax, spend and borrow. We will then see how
taxing, spending and borrowing affect the law of motion. Without a government and with α = 0,
young agents save all their wages. The logic is that no government takes away any of the young
agent’s wages in taxes and that agents only care about consumption when old, so young agents
save everything they can. The law of motion of the life-cycle model without government is
simply:
$$k_{t+1} = w_t = (1 - \beta) A k_t^{\beta}$$
When there are taxes, young agents save all of their wages after taxes, where wages after taxes are
denoted by the bracketed term $(w_t - T_{yt})$.
When there is a government, young agents have two forms in which to hold their savings:
risk-free government debt b and physical capital k.3 Thus, the part of their savings that is held in
physical capital is all their savings (wt − Tyt ) less the part held in government debt bt .
The law of motion for the capital-labor ratio with a government is thus given by:

$$k_{t+1} = (w_t - T_{yt}) - b_t = ((1 - \beta) A k_t^{\beta} - T_{yt}) - b_t$$
The law of motion says that the capital-labor ratio next period equals the wage per young agent
less taxes on the young and less the amount of government debt per young agent issued by the
government.
Note that the timing convention used for government debt is different from that used for the
capital stock. $B_t$ is total debt to be repaid at time t + 1 (accumulated during t) while $K_{t+1}$ is capital
accumulated during t, which produces output in t + 1. We use both conventions (i) to remain consistent
with the decomposition above and (ii) to illustrate the two conventions.

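Figure 7.4: Law of Motion: Fiscal Policy (horizontal axis: $k_t$; vertical axis: $k_{t+1}$; curves: the 45-degree line $k_{t+1} = k_t$, the law of motion without a government $k_{t+1} = (1 - \beta) A k_t^{\beta}$, and the law of motion with a government $k_{t+1} = (1 - \beta) A k_t^{\beta} - T_{yt} - b_t$)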

The term kt+1 is the capital-labor ratio whereas the term bt is the government debt to labor ratio.
The terms (kt+1 , bt ) can also be interpreted as the capital and debt per young agent.
3 In the model young agents are indifferent between holding government debt or physical capital provided both assets
pay the same return. Thus, when we analyze government debt we will assume both assets have the same risk-free return
and that both assets are riskless. Clearly, we abstract from risk to simplify the theory.
Figure 7.4 plots a diagram describing the law of motion. This follows the treatment of this law
of motion from the chapter on the life-cycle model. The law of motion without a government is
closely related to the law of motion with a government. They differ by a vertical shift capturing the
role of taxing young agents and borrowing from young agents. Both taxing and borrowing take
goods out of the hands of young agents that otherwise would have been converted into physical
capital. The graph shows that now there is more than one steady state with a positive capital-labor
ratio.
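
The claim that the shifted law of motion has more than one positive steady state can be explored numerically. The sketch below searches for steady states of $k_{t+1} = (1 - \beta) A k_t^{\beta} - T_{yt} - b_t$ by bisection, assuming constant taxes and debt with $T_y + b > 0$. The function names and the parameter values (A = 1, β = 0.3, a tax of 0.05 on the young) are hypothetical.

```python
def next_capital(k, A, beta, tax_young, debt):
    """Law of motion: k' = (1 - beta) * A * k**beta - T_y - b."""
    return (1.0 - beta) * A * k ** beta - tax_young - debt


def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f on [lo, hi], assuming f changes sign there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def steady_states(A, beta, tax_young, debt):
    """Return the positive steady states, assuming tax_young + debt > 0."""
    gap = lambda k: next_capital(k, A, beta, tax_young, debt) - k
    # The gap between the curve and the 45-degree line is largest here.
    k_peak = ((1.0 - beta) * A * beta) ** (1.0 / (1.0 - beta))
    if gap(k_peak) <= 0:
        return []  # the downward shift is too large
    # Steady state without a government; the gap is negative beyond it.
    k_no_gov = ((1.0 - beta) * A) ** (1.0 / (1.0 - beta))
    low = bisect(gap, 0.0, k_peak)
    high = bisect(gap, k_peak, k_no_gov + 1.0)
    return [low, high]


# Hypothetical parameter values.
found = steady_states(A=1.0, beta=0.3, tax_young=0.05, debt=0.0)
```

Under these assumed values the search returns two steady states, a low one and a high one, and the high one lies below the steady state of the economy without a government.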

7.4 Three Ways to Finance a War


We now put the model to work by analyzing an economy that initially is in a steady state with no
government spending, no debt and no taxes. This government then in the current period finds itself
in a protracted “Cold War”. The Cold War lasts forever and requires the government to extract
goods from consumers via taxation to fund the war. The war requires a fixed expenditure each
period equal to g > 0 per young agent in the economy.
There are three proposals on the table for how to finance this war. These are listed below. Policy
1 taxes each young agent g each period. This is just enough to pay for the war each period. Thus,
the government runs a balanced budget (i.e. the primary deficit and the overall deficit each period
is zero) and never borrows. Policy 2 taxes each old agent g each period. This is just enough to pay
for the war each period. Thus, once again the government runs a balanced budget each period and
never borrows.
Policy 1: Tax the Young $(T_{yt}, T_{ot}) = (g, 0), \forall t \geq 1$
Policy 2: Tax the Old $(T_{yt}, T_{ot}) = (0, g), \forall t \geq 1$
Policy 3: Debt Finance $(T_{y1}, T_{o1}) = (0, 0)$ and $(T_{yt}, T_{ot}) = (0, g(1 + r_t)), \forall t \geq 2$
Policy 3 is more complicated. Policy 3 does not tax any young or old agent in the first period -
at the very start of the Cold War. Instead, the government borrows an amount g from each young
agent in period 1 to finance its spending. In the second period and in all subsequent periods the
government starts taxing. The government taxes old agents enough to pay for the war in that period
and to pay interest on government debt. The government does not pay off the debt. It just pays the
interest on the debt so that the debt per young agent stays constant at bt = g > 0.

7.4.1 Analysis of Policy 1


We now analyze the economic consequences of financing the war using Policy 1. The analysis
of the effects of any policy in the life-cycle model starts by examining the law of motion for capital:
$k_{t+1} = (w_t - T_{yt}) - b_t = ((1 - \beta) A k_t^{\beta} - T_{yt}) - b_t$. The law of motion shifts whenever there is a
policy change that changes taxes on the young or the amount of borrowing from the young.
Policy 1 shifts down the law of motion for physical capital by the amount of the tax $T_{yt} = g > 0$
on young agents. Taxing the young takes goods out of the hands of young agents that would
otherwise be used to purchase physical capital. The result is that the capital-labor ratio falls over
time until it converges to a lower steady state value. Given that capital falls over time it is also clear
that output and wages fall over time and that the real interest rate increases over time.

7.4.2 Analysis of Policy 2


Policy 2 does not lead to a shift in the law of motion for physical capital. This is because this
policy does not change taxes on the young or borrowing from the young. Remember that young
agents save everything even though they anticipate paying taxes when old. Thus, the capital stock
does not change and neither does output, wages or the real interest rate. Clearly, something must
happen by taxing the old. What happens is clear from the national income and product accounts
identity: C + I + G = Y = F(K, L). Output does not change as capital and labor inputs do not
change. However, government spending goes up and total consumption goes down by exactly the
rise in government purchases of goods for the war.

7.4.3 Analysis of Policy 3


Policy 3 leads to a downward shift of the law of motion by exactly the amount of borrowing per
young agent which equals b = g > 0. Thus, we conclude that the effect on the capital stock is
exactly the same as in Policy 1 because the law of motion shifts downward by the same amount as
in Policy 1. For this reason, the effects on output, investment, wages, interest rates and aggregate
consumption are exactly the same as in Policy 1. It also turns out that the consumption of each
individual agent is the same in Policy 3 as in Policy 1. This is a striking result which may initially
be contrary to intuition.
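
The three policies can be compared by iterating the law of motion from the initial steady state. The sketch below is only an illustration under assumed parameter values (A = 1, β = 0.3, g = 0.05). The function name `simulate_capital` is hypothetical.

```python
def simulate_capital(k0, periods, A, beta, tax_young, debt):
    """Iterate k' = (1 - beta) * A * k**beta - T_y - b from k0."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append((1.0 - beta) * A * k ** beta - tax_young - debt)
    return path


# Hypothetical parameter values.
A, beta, g = 1.0, 0.3, 0.05
# Steady state with no government spending, taxes or debt.
k_initial = ((1.0 - beta) * A) ** (1.0 / (1.0 - beta))

# Policy 1 taxes the young g, Policy 2 taxes only the old (no shift),
# Policy 3 borrows g from the young each period.
policy_1 = simulate_capital(k_initial, 20, A, beta, tax_young=g, debt=0.0)
policy_2 = simulate_capital(k_initial, 20, A, beta, tax_young=0.0, debt=0.0)
policy_3 = simulate_capital(k_initial, 20, A, beta, tax_young=0.0, debt=g)
```

The capital paths under Policies 1 and 3 coincide and decline toward a lower steady state, while the path under Policy 2 stays at its initial level.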

7.4.4 Why Are Policy 1 and 3 Equivalent?


We now examine the logic for the equivalence of Policies 1 and 3. To do so, we calculate the
present value of taxation on each young agent under Policy 1 and 3. If Policies 1 and 3 impose
the same present value of taxation on each agent, then the budget sets must be the same for each
agent. When budget sets are the same across these two policies, then lifetime consumption choices
must be the same for a rational individual, regardless of the exact nature of their preferences over
consumption.
The present value of taxation is calculated below. We see that although the timing of taxation
differs, the present value is the same in both policies. Thus, these policies simply move an agent
along an unchanged budget line. This does not change the consumption plans that the agent can
purchase.

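$$\text{Policy 1}: \quad \text{PV Tax} = T_{yt} + \frac{T_{ot+1}}{1 + r_{t+1}} = g + \frac{0}{1 + r_{t+1}} = g$$

$$\text{Policy 3}: \quad \text{PV Tax} = T_{yt} + \frac{T_{ot+1}}{1 + r_{t+1}} = 0 + \frac{g(1 + r_{t+1})}{1 + r_{t+1}} = g$$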
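
The present-value calculation can be reproduced in a few lines. The sketch below also computes old-age consumption for an agent who saves all after-tax wages at the common return on debt and capital, which is the α = 0 case of this section. The function names and the numbers for the wage, g and the interest rate are hypothetical.

```python
def present_value_of_taxes(tax_young, tax_old_next, r_next):
    """PV Tax = T_y + T_o / (1 + r), with T_o paid one period later."""
    return tax_young + tax_old_next / (1.0 + r_next)


def old_age_consumption(wage, tax_young, tax_old_next, r_next):
    """Consumption when old if all after-tax wages are saved (alpha = 0)."""
    return (1.0 + r_next) * (wage - tax_young) - tax_old_next


# Hypothetical values for the wage, war spending per young agent and r.
wage, g, r_next = 0.5, 0.05, 0.4

pv_policy_1 = present_value_of_taxes(g, 0.0, r_next)
pv_policy_3 = present_value_of_taxes(0.0, g * (1.0 + r_next), r_next)

c_old_policy_1 = old_age_consumption(wage, g, 0.0, r_next)
c_old_policy_3 = old_age_consumption(wage, 0.0, g * (1.0 + r_next), r_next)
```

Both the present value of taxes and old-age consumption are the same under the two policies in this example, holding the wage and the interest rate fixed across policies.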

What we have just seen is an illustration of an important principle. Economists call this Ricardian
Equivalence. More specifically, economists note that many seemingly different government
policies can lead to exactly the same implications for outcomes. Two policies leading to the
same consequences are said to display Ricardian Equivalence. The principle behind the Ricardian
Equivalence result displayed here is that there are many seemingly different tax policies that still
impose the same present value of taxation on each agent, while keeping government spending on
goods the same across the policies. Two such policies then leave budget sets unchanged. Thus,
regardless of the nature of an agent’s preferences, rational choice then implies that consumption
choices must be unchanged across two such policies. It is clear that debt or asset choices are
changed but the all-important consumption choices are unchanged.

7.5 Multipliers
In the Great Recession of 2008, policy makers in many high-income countries were considering
policy responses to lift their economies out of recession. One policy response is a temporary
change in government purchases or a temporary change in some taxes. Dynamic spending or tax
“multipliers” describe the ratio of the change in output $\Delta Y_{t+n}$ at some horizon n to the change
in government spending $\Delta G_t$ or taxes $\Delta T_t$ in time period t that produced it. This section does two things. First,
the logic behind an empirical attempt to estimate dynamic multipliers will be explained. Second,
theory-based multipliers from two very different theoretical models will be calculated.

7.5.1 Empirics
This section describes the basic ideas behind the calculation of tax multipliers in applied work. In
doing so we employ the basic ideas presented in the work of Romer and Romer (2010).4
Consider the first equation in the statistical model below. It relates the change in output
$\Delta Y_t = Y_t - Y_{t-1}$ at time t to the change in (legislated) taxes $\Delta T_t = T_t - T_{t-1}$ at time t. The statistical
model posits that the change in output is a linear function of the change in taxes plus a disturbance
or shock term $\varepsilon_t$. If we measure $Y_t$ as the log of GDP and $T_t$ as the log of taxes (both at time t), then
$\Delta Y_t$ is the output growth rate and $\Delta T_t$ is the growth rate in taxes. The second equation acknowledges
that there is possibly a long list of factors, other than taxes, that may impact output. Adding these
up produces the disturbance $\varepsilon_t = \sum_{i=1}^{K} \varepsilon_t^i = \varepsilon_t^1 + \cdots + \varepsilon_t^K$.

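$$\Delta Y_t = \alpha + \beta \Delta T_t + \varepsilon_t \tag{7.9}$$
$$\varepsilon_t = \sum_{i=1}^{K} \varepsilon_t^i \tag{7.10}$$
$$\Delta T_t = \sum_{i=1}^{K} b_t^i \varepsilon_t^i + \sum_{j=1}^{L} \omega_t^j \tag{7.11}$$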

The third equation posits that the change in (legislated) taxes that is measured in the data is due
to two sources. One source, $\sum_{i=1}^{K} b_t^i \varepsilon_t^i$, is that part of the change in taxes occurs in response to the
same sources of variation leading to changes in output - the vector $(\varepsilon_t^1, \cdots, \varepsilon_t^K)$. They call this part
the “endogenous tax changes”. For example, policy makers may systematically reduce some taxes
when the economy enters a recession and output is low. The other source of variation captures the
possibility that some legislated tax changes may not be done in response to variables (the vector
$(\varepsilon_t^1, \cdots, \varepsilon_t^K)$) that directly impact output. This is the key assumption in their analysis. They call
this part “exogenous tax changes” and it is represented by the term $\sum_{j=1}^{L} \omega_t^j$.
Now combine the three equations above into equation (*) below:
$$(*) \quad \Delta Y_t = \alpha + \beta \Big( \sum_{j=1}^{L} \omega_t^j \Big) + \Big[ \beta \sum_{i=1}^{K} b_t^i \varepsilon_t^i + \sum_{i=1}^{K} \varepsilon_t^i \Big]$$

Equation (*) is key in the analysis of Romer and Romer (2010). They use the “narrative method”
to try to find and calculate in quarterly US data the change in legislated federal taxes (the term
$\sum_{j=1}^{L} \omega_t^j$ in equation (*)) that is not in response to the factors directly impacting output. They then
use linear regression methods to estimate versions of this equation under the assumption that the
calculated change in legislated tax term is uncorrelated (i.e. not systematically related) with the
disturbance term in square brackets.5 An estimate of the parameter $\beta$ is then their estimate of the
contemporaneous “government tax multiplier”.
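
A simulated example can illustrate why the distinction between endogenous and exogenous tax changes matters for the regression. The sketch below is not the procedure of Romer and Romer (2010). It generates artificial data from equations 7.9 to 7.11 with K = L = 1 and α = 0, and then compares the least squares slope obtained from the total tax change with the slope obtained from the exogenous part alone. The function names, the true β of −1, the response coefficient b = 0.5, the sample size and the seed are all hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the least squares line of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


def simulate(true_beta=-1.0, b=0.5, n=20000, seed=0):
    """Simulate the statistical model with one shock and one exogenous term."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n)]      # epsilon_t
    exogenous = [rng.gauss(0.0, 1.0) for _ in range(n)]   # omega_t
    # Equation 7.11: taxes respond to the output shock (b > 0 means taxes
    # are cut when the shock is negative) plus an exogenous part.
    d_taxes = [b * e + w for e, w in zip(shocks, exogenous)]
    # Equation 7.9 with alpha = 0.
    d_output = [true_beta * dt + e for dt, e in zip(d_taxes, shocks)]
    slope_total = ols_slope(d_taxes, d_output)        # uses all tax changes
    slope_exogenous = ols_slope(exogenous, d_output)  # uses exogenous part
    return slope_total, slope_exogenous


slope_total, slope_exogenous = simulate()
```

In this artificial setting the slope based on the exogenous part is close to the true β, while the slope based on the total tax change is pulled toward zero because the regressor is correlated with the disturbance.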
4 See Christina Romer and David Romer (2010), The Macroeconomic Effects of Tax Changes: Estimates Based on a
New Measure of Fiscal Shocks, American Economic Review, 100, 763-801.
5 The technique of linear regression, taught in an econometrics course or a statistics course, produces an unbiased
estimate of the unknown parameter $\beta$ when the disturbance term $[\beta \sum_{i=1}^{K} b_t^i \varepsilon_t^i + \sum_{i=1}^{K} \varepsilon_t^i]$ in equation (*) is uncorrelated
with the regressor term $(\sum_{j=1}^{L} \omega_t^j)$, among other key assumptions. The basic problem can be illustrated with an intuitive
example outside the realm of economics.
Consider the problem of a scientist who wants to measure the contribution of water to plant growth and has data on
daily plant growth and daily quantities of water applied. A linear regression technique such as ordinary least squares
will draw a line of best fit through the scatter plot relating plant growth to water applied. The slope of this line may
overestimate the marginal contribution of water to plant growth if the plant was watered more on days of high sunshine
than on days of low sunshine. If so, then the slope in question would measure the contribution of water and part of the
effect of sunshine instead of just the water. This example highlights a situation in which the regressor term (i.e. water) is
correlated with factors (i.e. sunshine) that govern the disturbance term.

How do Romer and Romer (2010) try to separate legislated tax changes into endogenous and
exogenous parts? They give two examples of exogenous tax changes. The Clinton era tax increase
of 1993 is an example where they claim that taxes were raised not because policy makers felt
that the economy needed to be restrained but because policy makers felt it was prudent and might
increase long-run growth. Thus, roughly put, the tax change was not in response to current shocks
impacting output. The Kennedy-Johnson era tax cut of 1964 is an example where the claim is that
it was put in place to help long-run growth. They arrive at this classification by reading various
accounts in government publications of the motivation behind the tax changes - hence they label the
method the “narrative method”. An example of an endogenous tax change is a change motivated by
smoothing out a recession. For example, Romer and Romer put the tax cut of 1975 in this category.
Presumably, the temporary payroll tax cut associated with the Great Recession of 2008 would be
classified as endogenous.
The results reported in Romer and Romer (2010) are based on a more elaborate version of the
simple statistical model described above. They employ the equation $\Delta Y_t = \alpha + \sum_{i=0}^{M} \beta_i \Delta T_{t-i} + \varepsilon_t$
which allows tax changes in the current period $\Delta T_t$ and previous periods $\Delta T_{t-i}$ for $i \geq 1$ to impact
current output changes. The variable $\Delta T_{t-i}$ is measured using the “exogenous part” of tax changes
from their narrative approach. Thus, it corresponds to the term $\sum_{j=1}^{L} \omega_t^j$ used in the simple statistical
model.

$$\Delta Y_t = \alpha + \sum_{i=0}^{M} \beta_i \Delta T_{t-i} + \varepsilon_t$$

Their estimate of the multiplier n quarters after an exogenous tax increase equal to 1 percent of
GDP is given by summing the coefficients $\beta_0$ through $\beta_n$, so that the dynamic multiplier at horizon n quarters
ahead is $\beta_0 + \beta_1 + \cdots + \beta_n$. Romer and Romer (2010, Figure 4) find that (i) the contemporaneous
multiplier is zero (i.e. $\beta_0 = 0$), (ii) the multiplier falls as the horizon n increases and (iii) the
multiplier is approximately −3.0 eight quarters ahead when n = 8. Thus, they find that exogenous
tax increases are contractionary in US data. They lead to decreases in output of greater size, eight
quarters ahead, than the size of the tax increase because the sum of the estimated coefficients is
below −1.
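
The step from regression coefficients to dynamic multipliers is a cumulative sum, which the following sketch illustrates. The coefficients are hypothetical values chosen only so that the contemporaneous multiplier is zero and the multiplier eight quarters ahead is −3.0. They are not the estimates reported by Romer and Romer (2010), and the function name `dynamic_multipliers` is an assumption.

```python
from itertools import accumulate


def dynamic_multipliers(coefficients):
    """Return beta_0 + ... + beta_n for each horizon n = 0, ..., M.

    coefficients[i] is the coefficient beta_i on the tax change i
    quarters earlier.
    """
    return list(accumulate(coefficients))


# Hypothetical coefficients beta_0, ..., beta_8 (illustration only).
hypothetical_betas = [0.0, -0.4, -0.5, -0.5, -0.4, -0.4, -0.3, -0.3, -0.2]
multipliers = dynamic_multipliers(hypothetical_betas)
```

Because every assumed coefficient after the first is negative, the cumulative multiplier falls as the horizon increases, which mirrors the qualitative pattern described in the text.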

7.5.2 Theoretical Models for Multipliers


We will consider two theoretical models: the Keynesian model and the life-cycle model. We focus
attention on the government tax multiplier produced in these theoretical models because we just
reviewed the empirics behind estimated tax multipliers. The evidence indicated that
contemporaneous tax multipliers are approximately zero and that tax multipliers eight quarters ahead
are approximately −3. Thus, it is relevant to ask if the theoretical models are consistent with this evidence.

Keynesian Model
We have already worked out the tax multiplier from the Keynesian model in Chapter 6.4. The
first equation below is output in the Keynesian model. The second equation is the tax multiplier
that is implied by the first equation. Thus, the theory says that the contemporaneous effect of an
increase in taxes, holding spending constant, is to decrease output. There are no dynamic effects as
the Keynesian model is static in that variables at different time periods do not enter the equation.
The empirical multiplier from Romer and Romer (2010) indicated that the contemporaneous tax
multiplier was zero but the tax multiplier was −3.0 eight quarters ahead.

$$Y = \frac{a - bT + I + G}{1 - b}$$

$$\frac{\Delta Y}{\Delta T} = -\frac{b}{1 - b} < 0$$
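
The Keynesian multipliers follow directly from the output equation, as the sketch below illustrates by changing taxes and spending and recomputing output. The function names and the parameter values (a = 10, b = 0.6, I = 15, G = 25, T = 20) are hypothetical.

```python
def keynesian_output(a, b, taxes, investment, spending):
    """Output in the Keynesian model: Y = (a - b*T + I + G) / (1 - b)."""
    return (a - b * taxes + investment + spending) / (1.0 - b)


def tax_multiplier(b):
    """Change in output per unit increase in taxes, spending held fixed."""
    return -b / (1.0 - b)


# Hypothetical parameter values.
a, b, investment, spending, taxes = 10.0, 0.6, 15.0, 25.0, 20.0

y_base = keynesian_output(a, b, taxes, investment, spending)
# Raise taxes by one unit, holding spending constant.
y_tax_up = keynesian_output(a, b, taxes + 1.0, investment, spending)
# Raise taxes and spending by one unit each (balanced-budget increase).
y_balanced = keynesian_output(a, b, taxes + 1.0, investment, spending + 1.0)
```

In this example the tax increase alone lowers output by b / (1 − b), while raising taxes and spending together by one unit raises output by one unit.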

Life-Cycle Model
To generate multipliers in the life-cycle model we compare a benchmark scenario to an alternative
scenario. The alternative scenario features an increase in government spending in the initial period -
say period t = 0. We calculate output paths over time in both scenarios. If output is greater in
period t in the alternative scenario compared to the benchmark scenario so that $y_t^{alter} - y_t^{bench} > 0$,
then we say that the government spending multiplier is positive in period t.6 We want to figure out
whether multipliers are positive or negative at different time horizons.
We keep the benchmark scenario simple by assuming that the economy is in a steady state with
no government spending, no taxes and no debt. Thus, the output-labor ratio and the capital-labor
ratio are constant over time in the benchmark scenario. The alternative scenario entails positive
government spending $g_0^{alter} > 0$ in period zero but no government spending in all future periods (i.e.
$g_t^{alter} = 0$ for $t \geq 1$). Thus, we analyze a one-time increase in spending. We assume for convenience
that the government spending is not a substitute for private consumption (e.g. funding NASA to
send a spacecraft to Mars is probably not a close substitute for restaurant meals). This assumption
is consistent with those that Keynesian economists have often found to be convenient.
Following the logic of the model, any government spending plan has to be associated with a tax
plan that satisfies the government’s intertemporal budget constraint. There are many tax plans that do
this. Each tax plan will have associated with it a set of government spending multipliers
at different time horizons because the output path depends, in general, on the tax plan. We pick a
simple plan and set $T_y = T_o = g_0^{alter}/2$ at time t = 0 with no taxes on any young or old agent at any
future date. Thus, the government runs a balanced budget at all points in time and at time t = 0 the
young and old agents equally share the cost of government spending.
What happens in the alternative scenario? To figure this out consider what happens to the law
of motion in the alternative scenario compared to the benchmark scenario. The general equation
for this law of motion is provided below and was graphed previously in Figure 7.4. It is clear that
the law of motion graphing next period’s capital-labor ratio as a function of this period’s
capital-labor ratio shifts downward for exactly one period (in period t = 0) and then shifts back to the
original position. This occurs as young agents under the alternative scenario have less to save as the
government is now taxing them $T_y = g_0^{alter}/2 > 0$ in period t = 0. The government does not borrow
from the young so $b_t = 0$ in all periods.

$$k_{t+1} = w_t - T_{yt} - b_t = (1 - \beta) A k_t^{\beta} - T_{yt} - b_t$$

The consequence is that in period t = 1 the capital-labor ratio is smaller in the alternative
scenario than in the benchmark. Thus, output is lower in the alternative scenario compared to the
benchmark as $y_t^{alter} = F(k_t^{alter}, 1) < y_t^{bench} = F(k_t^{bench}, 1)$. Over time the capital-labor ratio in the
alternative scenario converges to the level in the benchmark scenario from below. Thus, output in
the alternative scenario also converges over time to that in the benchmark scenario from below. The
outcome is that the government spending multiplier is negative at horizon t = 1, 2, ... and is exactly
zero at horizon t = 0 in the life-cycle model. In the Keynesian model the multiplier associated with
a balanced-budget spending increase was positive. More specifically, the multiplier at horizon t = 0
in the Keynesian model is positive. The empirical results in Romer and Romer (2010) were that the
contemporaneous multiplier for taxes was zero and that the multiplier was negative for horizons
several quarters ahead.
6 One could also calculate a government spending multiplier at horizon t as $(y_t^{alter} - y_t^{bench})/(g_0^{alter} - g_0^{bench})$.
106 Chapter 7. Fiscal Policy

The only way to get a positive multiplier from a tax increase in the life-cycle model is to get
labor or capital inputs to increase beyond their levels in the benchmark model. The model has
young agents always supplying the same amount of labor input regardless of the wage rate. Thus,
without a substantial change to the model (a change that alters attitudes to leisure and labor) the
sign of the multiplier boils down to how the taxes that finance government spending affect saving.
The young agents at time t = 0 are now worth less over their lifetime as the government taxes them.
Thus, they save less than in the benchmark scenario. This leads to negative spending multipliers at
dates t = 1, 2, ....
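
The one-period tax experiment can be traced numerically. What follows is a minimal sketch rather than the text's own computation. It assumes that output per worker takes the Cobb-Douglas form $y = A k^{\beta}$ implied by the wage expression $(1 - \beta) A k^{\beta}$, that both scenarios start at the no-tax steady state, and that benchmark spending is zero. The values of A, β and $g_0^{alter}$ are purely illustrative, and the multiplier is computed as in footnote 6.

```python
# Illustrative parameters (assumptions, not values from the text).
A = 10.0        # productivity level
BETA = 0.3      # capital share
G0_ALTER = 1.0  # government spending at t = 0 in the alternative scenario
G0_BENCH = 0.0  # assumed: no spending in the benchmark scenario


def next_k(k, tax_young=0.0, borrowing=0.0):
    """Law of motion: k' = (1 - beta) * A * k**beta - T_y - b."""
    return (1.0 - BETA) * A * k ** BETA - tax_young - borrowing


def output(k):
    """Output per worker, assuming F(k, 1) = A * k**beta."""
    return A * k ** BETA


def simulate(k0, taxes_young, periods):
    """Return the path k_0, ..., k_periods given a dict {t: T_yt}."""
    path = [k0]
    for t in range(periods):
        path.append(next_k(path[-1], tax_young=taxes_young.get(t, 0.0)))
    return path


# The no-tax steady state solves k = (1 - beta) * A * k**beta.
k_star = ((1.0 - BETA) * A) ** (1.0 / (1.0 - BETA))

bench = simulate(k_star, {}, periods=8)
# Young agents pay half of the spending at t = 0 and nothing afterwards.
alter = simulate(k_star, {0: G0_ALTER / 2.0}, periods=8)

# Multiplier at horizon t: (y_t^alter - y_t^bench) / (g_0^alter - g_0^bench).
multipliers = [
    (output(ka) - output(kb)) / (G0_ALTER - G0_BENCH)
    for ka, kb in zip(alter, bench)
]

for t, m in enumerate(multipliers):
    print(f"t = {t}: multiplier = {m:.4f}")
```

Under these assumptions the multiplier is zero at t = 0, negative from t = 1 onward, and shrinks toward zero as the capital-labor ratio returns to the benchmark path from below.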
We acknowledge that allowing for a meaningful labor-leisure choice is an important theoretical
route to positive government spending multipliers. Higher spending financed by higher (lump-sum)
taxes may lead agents to work more and produce more output when leisure is a normal good. The
mechanism is that the taxes make agents poorer when the government spending is not a substitute
for private consumption. If leisure is a normal good, poorer agents choose less leisure and supply
more labor, which helps to increase output.

7.6 Social Security Systems


Governments in most countries in the world have instituted social security systems. A common
feature of many of these systems is that the government forces workers to pay taxes on labor
earnings and then the government uses these tax receipts to pay benefits for current retirees’
pensions and health care or payments to disabled workers. The United States has had a social
security system with some of these features since the Social Security Act of 1935. A key component
of the current U.S. system is that retirement, disability and old-age health care benefits are largely
funded on what is called a pay-as-you-go basis.
Under a pure pay-as-you-go system, the benefits paid to current retirees are financed exclusively
by taxes paid by current workers. Thus, under a pure pay-as-you-go system the taxes paid by current
workers are not invested in claims on private firms (e.g. stocks and bonds) and then later sold to pay
benefits to these workers when they retire. Systems without the accumulation of assets to finance
future benefits are sometimes called unfunded systems. This terminology is apt in that, under such
an unfunded system, a birth cohort does not build up financial assets that are later used to fund
retirement payments to its surviving members. Instead, the funding for
prospective benefits in an unfunded system relies upon the political will of future governments to
tax future generations of workers.
The remainder of this section will do two things. First, a theoretical analysis of a pay-as-you-go
social security system is presented within the life-cycle model. Second, some features of the US
social security system are described.

7.6.1 Social Security: Theory


In this section we will use the life-cycle model to analyze the effects of starting a pay-as-you-go
system from a steady state in which there initially was no social security system, no taxation, no
government spending and no debt. In the language of the previous section, a pay-as-you-go social
security system is a tax policy in which current workers are charged a tax of s > 0 and every current
old person is charged a tax of −s < 0. A negative tax is called a transfer. Thus, the policy is stated
simply as $(T_{yt}, T_{ot}) = (s, -s), \forall t \geq 1$. Clearly, this is resource feasible as there are by assumption as
many old people as young people in the life-cycle model.
We will not analyze the question of whether or not such a policy is politically feasible. One view
of political feasibility is whether, given some voting scheme, the agents within the model would
vote for the continuation of such a scheme over a well-specified alternative scheme in each model
period. It is clear that old agents within the life-cycle model always prefer to stick with a system as
long as the transfer they receive in the existing system exceeds the transfer in the alternative system.
This is because, in the model, the payment to capital that the old agents receive depends only
on the total quantities of factor inputs (K, L) and these payments are not altered by whether or not
the old agents receive a transfer. The policy preferences for the young agents at different dates are
trickier to determine but can be worked out for some simple policies.
To understand the consequences of moving to policy $(T_{yt}, T_{ot}) = (s, -s)$ from a policy of no
taxation, no government transfers, no spending and no debt, we highlight how the new policy
shifts the law of motion for the capital-labor ratio. The law of motion is
$k_{t+1} = (w_t - T_{yt}) - b_t = ((1 - \beta) A k_t^{\beta} - T_{yt}) - b_t$.
The insight from the previous section is that government policy shifts
down the law of motion by exactly the taxes on young agents $T_{yt}$ plus the borrowing from young
agents $b_t$. Thus, under the pay-as-you-go social security system the law of motion shifts down by
exactly $T_{yt} = s > 0$.

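Figure 7.5: Law of Motion: Social Security
[Figure: $k_{t+1}$ plotted against $k_t$, showing the 45-degree line $k_{t+1} = k_t$, the law of motion without social security $k_{t+1} = (1 - \beta) A k_t^{\beta}$, and the lower law of motion with social security $k_{t+1} = (1 - \beta) A k_t^{\beta} - s$.]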

The theory then predicts that over time this causes the capital-labor ratio to fall to a lower
steady-state level. Intuitively, this raises the utility of currently old agents at the time of the policy
change but lowers the utility of all agents born far in the future. These agents are born into a world
with lower capital, wages and output. For this intuition to prove correct within the model it is
important that the initial steady state is below the Golden Rule steady state. As long as this holds
then the new steady state will not only have lower capital, wages and output but will also have
lower consumption.
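
The fall to a lower steady state can be illustrated numerically. The sketch below is only an illustration under stated assumptions: output per worker is taken to be $y = A k^{\beta}$ (the form implied by the wage expression in the law of motion), government borrowing is zero, and the values of A, β and s are hypothetical rather than taken from the text. It iterates the shifted law of motion from the initial steady state and compares capital, wages and output.

```python
A = 10.0    # productivity level (illustrative assumption)
BETA = 0.3  # capital share (illustrative assumption)
S = 1.0     # social security tax on each young agent (illustrative assumption)


def wage(k):
    """Wage from the law of motion: w = (1 - beta) * A * k**beta."""
    return (1.0 - BETA) * A * k ** BETA


def output(k):
    """Output per worker, assuming F(k, 1) = A * k**beta."""
    return A * k ** BETA


def steady_state(k0, s, iterations=500):
    """Iterate k' = w(k) - s (with b_t = 0) until the path settles down."""
    k = k0
    for _ in range(iterations):
        k = wage(k) - s
    return k


# Initial steady state without social security: k = (1 - beta) * A * k**beta.
k_initial = ((1.0 - BETA) * A) ** (1.0 / (1.0 - BETA))
k_new = steady_state(k_initial, S)

print(f"capital: {k_initial:.3f} -> {k_new:.3f}")
print(f"wage:    {wage(k_initial):.3f} -> {wage(k_new):.3f}")
print(f"output:  {output(k_initial):.3f} -> {output(k_new):.3f}")
```

The sketch does not compare steady-state consumption, since that comparison depends on where the initial steady state lies relative to the Golden Rule.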
It is useful to try to get an intuitive idea of what social security does in this model. One way to
do so is to calculate the present value of taxation on an agent. We do so below. The calculation
shows that a pay-as-you-go system is equivalent to a net tax in present-value terms exactly when
the real interest rate is positive. Thus, from this perspective the social security system within this
model amounts to providing a gift of s > 0 to each initial old agent that is paid for by imposing a
net tax on all current young agents and on all agents born in the future.
$$\text{PV Tax} = T_{yt} + \frac{T_{ot+1}}{1 + r_{t+1}} = s + \frac{-s}{1 + r_{t+1}} = \frac{s\, r_{t+1}}{1 + r_{t+1}} > 0$$

A second way to view social security in this model is to say that social security is equivalent to
the government forcing all agents, other than the initial old agents, into giving the government a
zero real interest rate loan. This interpretation comes from the point that agents give up s when
young and get back s when old so that no “interest” is paid in the model social security arrangement.
It also comes from the term $s r_{t+1}$ in the numerator of the equation above. This is the interest not
paid by the social security system. Clearly, when the market interest rate is positive, these agents
do not benefit from participating in this scheme!
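
Both interpretations can be checked with a few lines of arithmetic. The sketch below computes the present value of an agent's lifetime taxes and compares it with the discounted interest forgone on the implicit loan. The function names and the interest rates in the loop are illustrative choices, not values from the text.

```python
def pv_tax(s, r_next):
    """Present value of lifetime taxes: pay s when young, receive s when old."""
    tax_young = s
    tax_old = -s  # a negative tax is a transfer
    return tax_young + tax_old / (1.0 + r_next)


def forgone_interest_pv(s, r_next):
    """Discounted value of the interest s * r not paid on the implicit loan."""
    return s * r_next / (1.0 + r_next)


s = 1.0
for r_next in (-0.1, 0.0, 0.5, 1.0):
    print(
        f"r = {r_next:+.1f}: PV tax = {pv_tax(s, r_next):+.4f}, "
        f"forgone interest = {forgone_interest_pv(s, r_next):+.4f}"
    )
```

The two columns coincide for every interest rate. The net tax is positive only when the interest rate is positive and is zero when the interest rate is zero.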
One can ask whether a pay-as-you-go system is a good idea in some normative sense within the
model. Although it is difficult to win battles on normative questions, it is important to discuss them.
A pay-as-you-go system in the life-cycle model helps some agents but hurts others. So moving
to this system does not result in a Pareto improvement as long as the real interest rate without the
system is positive each period. This conclusion about Pareto efficiency is a consequence of the
Proposition established in section 4 of Chapter 5. Thus, the life-cycle model does not give a ringing
endorsement for adopting social security systems on welfare grounds.
It is key to point out that the life-cycle model does not have any risks facing agents that the
model social security system helps to insure. The absence of risk is one reason for the relatively
simple conclusions reached for how social security impacts the economy. Recall that social security
is essentially a way to redistribute wealth across generations within this model. Some generations
get a positive transfer, whereas others get a negative present-value transfer. Social security plays no
insurance role in the life-cycle model.
Some economists have argued that there is a welfare argument to be made for government
provision of insurance. For example, James Mirrlees (1995, p. 384) provides two reasons for
government provision:7
From the point of view of insurance, there seem to me to be two compelling theoretical
arguments for having the State rather than the market provide a wide range of insurance, for
old-age pensions, disability and sickness, unemployment and low income: the first is that the
market handles adverse selection badly. The second is that, even if adverse selection were
not important, people should take out insurance at an age when they are incapable of doing
so rationally, namely zero.

7.6.2 Social Security: Some US Facts


The Social Security Act of 1935 set up a system whereby workers and their employers were taxed
on the wage income of workers and these revenues were used to fund benefit payments. Since
the 1930s the U.S. social security system has changed in many ways. For example, the original
system had an old-age (OA) benefit, a pension paid to those beyond a set retirement age. The
current system has old-age (OA), survivors insurance (SI), disability insurance (DI) and hospital
insurance (HI) benefits. Moreover, the types of benefits provided under HI, more commonly called
Medicare, have expanded over time. The most recent expansion added a prescription drug benefit
to the Medicare system without altering the earnings tax rate that funds these benefits. As of
2010, the annual benefit payments in the social security and Medicare systems combined equaled
approximately 7 percent of GDP. Currently, well over 90 percent of the U.S. work force is required
to pay taxes supporting the benefits paid by the social security and Medicare systems.

7 Mirrlees (1995), Private Risk and Public Action: The Economics of the Welfare State. European Economic Review
39: 383-97. Mirrlees received the Nobel Prize in Economics in 1996 for his work on optimal income taxation.

Table 7.1: Social Security in the US


Benefit           Year Introduced    2010 Tax Rate (percent)
OA + SI           1935, 1939         10.6
DI                1956               1.8
HI (Medicare)     1965               2.9
Total             -                  15.3

The tax rates funding the various benefits have increased over time. There is a proportional tax
rate funding OASI benefits (10.6 percent), DI benefits (1.8 percent) and HI benefits (2.9 percent).
These are the total tax rates: half of each tax is paid by the employee and half by the employer. These
tax rates apply up to an upper bound or cap on earnings. Workers who have earnings above the cap
pay no extra taxes on the portion of earnings beyond the cap. The earnings cap on OASIDI taxes
was 106,800 dollars in 2010. There is currently no earnings cap on HI taxes. After the recession of
2008, the combined OASIDI tax rate described above was temporarily lowered for a period of two
years.
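
As a worked example of how the rates and the earnings cap interact, the sketch below computes the 2010 tax on a worker's annual earnings using only the figures quoted above. The function name and the earnings levels are hypothetical, and the sketch ignores the temporary rate reduction mentioned above.

```python
# 2010 figures quoted in the text (total rates, employee plus employer).
OASI_RATE = 0.106
DI_RATE = 0.018
HI_RATE = 0.029
OASIDI_EARNINGS_CAP = 106_800.0  # dollars; HI taxes have no cap


def payroll_tax(earnings):
    """Return the total, employee and employer tax on annual earnings."""
    capped = min(earnings, OASIDI_EARNINGS_CAP)
    total = (OASI_RATE + DI_RATE) * capped + HI_RATE * earnings
    return {"total": total, "employee": total / 2.0, "employer": total / 2.0}


for earnings in (50_000.0, 106_800.0, 200_000.0):
    tax = payroll_tax(earnings)
    share = tax["total"] / earnings
    print(
        f"earnings {earnings:>9,.0f}: total tax {tax['total']:>9,.2f} "
        f"({100 * share:.2f} percent of earnings)"
    )
```

Below the cap the combined rate is 15.3 percent of earnings. Above the cap the average rate falls, because only the 2.9 percent HI tax applies to the additional earnings.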
The benefits listed in Table 7.1 are largely funded on a pay-as-you-go basis. A system that
is funded purely on a pay-as-you-go basis collects taxes on current workers and pays out all tax revenue
to current beneficiaries. Thus, under such a system there is no trust fund built up to pay current
workers' future benefits. The U.S. system has a small trust fund built up for each distinct benefit,
and these trust funds are being rapidly drawn down. The sense in which these trust funds are small and the speed
with which they are being drawn down can be gauged from reading the annual Trustees Reports for
the Social Security and Medicare Systems. These trust funds hold only U.S. Treasury debt. As one
part of the Federal Government of the U.S. is holding the debt of another part of the government,
there has been a debate about whether or not such assets should be viewed as “backing” for future
benefit payments.
The Trustees Reports in each year of the last several decades have forecast that, without
changes in tax rates or benefit formulas, there will be an impending crisis. Each report forecasts
the year in which each trust fund will run out of money. Forecasts from 2011 indicate that the
combined OASIDI trust fund will be exhausted by 2036. The problem is that benefit payments as a
fraction of GDP or total earnings are projected to increase over time, whereas tax payments implied
by current and future tax rates are projected to be stable as a fraction of GDP or total earnings. Thus,
once the trust funds are exhausted annual tax revenues will be only a fraction of forecasted annual
benefit payments. The trend in benefits and taxation is driven by demographic change: people are
living longer and the birth rate is not projected to increase. Thus, the ratio of workers per retiree
has declined over time and is projected to continue to decline.

7.7 Overview
We take away three main fiscal policy lessons from theory.
1. Changes in government spending can in theory have a powerful impact on the economy. This
was illustrated in section 7.4 by comparing the economy with no government spending to
the economy with a constant positive amount of government spending that was financed by
always taxing young agents. This change in government spending was contractionary in that
it reduced output and aggregate physical capital in all future periods.
2. Different tax policies that finance the same government spending plan over time can have
exactly the same impact on the economy. The principle underlying this result is called
Ricardian Equivalence. Two policies that finance the same spending plan will have the same
impact when they impose the same present value of taxation on each household. This situation
was illustrated by Policy 1 and Policy 3 from section 7.4. This result holds independently of
the nature of the utility function for agents in the life-cycle model, but the argument for the
result does utilize the assumption that taxes are lump-sum taxes and not taxes that depend on
income (e.g. proportional or progressive income taxes).
3. Shifting the burden of paying for a fixed government spending plan onto future generations is
contractionary in the life-cycle model. This point was illustrated by comparing Policy 1 and
Policy 2 in the war finance example. It was also illustrated in comparing no social security
system to a pay-as-you-go social security system. Thus, starting up a new pay-as-you-go
social security system in the life-cycle model amounts to shifting tax burdens onto future
generations and away from the current generation.

7.8 Key Concepts


A budget deficit is a situation in which current government spending plus interest on
government debt exceeds current tax revenues.
A primary budget deficit is a situation in which current government spending exceeds
current tax revenues.
The government’s present value budget constraint states that the present value of current
and future government spending plus the value of the current debt plus interest can be no
more than the present value of current and future government taxes.
A pay-as-you-go social security system is a system in which current social security benefits
are financed exclusively by social security taxes on current workers. Thus, there is no trust
fund of government or private-sector assets that back social security payments in a pure
pay-as-you-go social security system.
