Game Theory

BASICS OF GAME THEORY
Jerome Renault, M2 Ecomath TSE

This version: August 27, 2013. https://sites.google.com/site/jrenaultsite/lecturenotes
2 Game Theory TSE 2013 J. Renault
Contents
Introduction
1 Strategic games
  1.1 Model
  1.2 Dominant strategy equilibrium
  1.3 Nash equilibrium
    1.3.1 Definition and first properties
    1.3.2 Existence of a Nash equilibrium
    1.3.3 Iterated elimination of strictly dominated strategies
  1.4 Mixed strategies
    1.4.1 Finite games and mixed strategies
    1.4.2 Elimination of strategies strictly dominated by a mixed strategy
    1.4.3 Generalization of the mixed extension
  1.5 Rationalizability
  1.6 Zero-sum games
    1.6.1 Definition, value and optimal strategies
    1.6.2 Links with Nash equilibria and the MinMax theorem
2 Extensive games
  2.1 Introduction
  2.2 Model
  2.3 Associated strategic form
  2.4 Games with perfect information
  2.5 Behavior strategies
  2.6 Sequential Rationality
  2.7 Subgame-perfect, Bayesian-perfect and Sequential Equilibria
    2.7.1 Subgame-perfect equilibria
    2.7.2 Bayesian-perfect equilibria
    2.7.3 Sequential equilibria
3 Bayesian games and games with incomplete information
  3.1 Modeling incomplete information
  3.2 Bayesian games
  3.3 Games with incomplete information
4 Correlated Equilibrium
  4.1 Examples
  4.2 Definitions
  4.3 Canonical correlated equilibrium
  4.4 Extensive-form correlated equilibrium and communication equilibrium
5 Introduction to repeated games
  5.1 Examples
  5.2 Model of standard repeated games
    5.2.1 Histories and plays
    5.2.2 Strategies
    5.2.3 Payoffs
  5.3 Feasible and individually rational payoffs
  5.4 The Folk theorems
    5.4.1 The Folk theorems for G
    5.4.2 The discounted Folk theorems
    5.4.3 The finitely repeated Folk theorems
  5.5 Extensions: examples
6 Exercises
  6.1 Strategic games
  6.2 Extensive-form games
  6.3 Bayesian games and games with incomplete information
  6.4 Correlated equilibrium
  6.5 Repeated Games
Introduction
A strategic interaction is a situation where:
a) there are several persons (in a broad sense: physical persons, juridical
persons, animals, software, automata...) called players,
b) each of these players has to do something (choice of actions or strate-
gies),
c) the utility (happiness, money transfer,...) that each player will get
from the interaction does not only depend on his own choice, but may also
depend on the choices of the other players.
Game theory studies strategic interactions, called games. Such situations
are extremely frequent, and in social sciences almost all studied phenomena
have a strategic aspect. The following are some examples of games.
1. An auction to buy an indivisible good: a bidder prefers to be the
unique person to submit a price, and if it is not the case, he will prefer the
other proposed prices to be very low.
2. An oligopoly where several firms selling the same good have to choose
their own price: a firm usually prefers that the other firms set high prices.
3. An election, e.g. the presidential election in France.
4. Fun games (games in the usual sense), such as chess or poker.
One could add price discrimination, insurance policy and bonus-malus
contracts, financial markets etc. Games are indeed so widely present that it
is difficult to say something both general and useful in applications. Here
are a few main questions:
- How to model a strategic interaction?
- Is it possible to determine what rational players should play? What
is the meaning of "playing strategically" or "playing well"?
- When is it the case that strategic play leads to good (socially optimal)
outcomes? How can we construct mechanisms leading to such games, and
avoid playing other games?
These notes are no more than an introduction to non-cooperative game
theory; there are no chapters dedicated to cooperative games. The objective
is to introduce and study strategic concepts, and mathematical formalism
and rigour will largely be used. This text also contains interpretations of
the situations or concepts and a few economic examples, but nothing is said
about interpretations or applications to biology (evolution theory), to
computer science and algorithmic game theory (cryptography, automata,
congestion models...).
Thanks to Stephen Wolff for many language corrections.
Chapter 1
Strategic games
Strategic games are used to study strategic interactions with only one stage
(one-shot games), and are also fundamental when we do not want to take
into account the explicit time structure of the interaction.
1.1 Model
Definition 1.1.1. A strategic game (also strategic-form game, or normal-form
game) is described as $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ where:
a) $N$ is a non-empty set called the set of players.
b) for each player $i$ in $N$, $A^i$ is a non-empty set called the set of actions
(or strategies) of player $i$.
c) for each player $i$ in $N$, $g^i$ is a mapping from $\prod_{j \in N} A^j$ to $\mathbb{R}$ called the
payoff function of player $i$.
One can think of a strategic game as a simultaneous one-shot game.
More precisely, the interaction is the following. Each player $i$ has to select
an action $a^i$ in $A^i$. The choices of the players are simultaneous. At the end
of the game, if each player $j$ has chosen $a^j$ in $A^j$, the payoff (or utility) to
each player $i$ is given by $g^i(a)$, where $a = (a^j)_{j \in N}$. The goal of each player is
to maximize his own payoff. All players know $G$.
Remarks 1.1.2. For the interpretation, it is not strictly speaking necessary
that the choices of actions are simultaneous. It is indeed enough that at the
moment when a player chooses his action, this player is not aware of the
possible choices already made by the other players. For example, each player
may write his selected action in a sealed envelope, and finally all the envelopes
are collected and opened by a referee.
It is sometimes also considered that all players know that all players know
G, and also that all players know that all players know that all players know
G, etc. When these considerations are assumed ad infinitum, the game G is
said to be common knowledge among the players. This happens for example
when a referee publicly explains the rules of the game to the players before
they play.
Moreover, it is sometimes considered that all players are clever or ra-
tional, and that all players know that all players are rational, and that all
players know that all players know that all players are rational, etc. up to
common knowledge of rationality. However, be careful: it is far from easy to
define what rationality means here.
In this text we won't pay much attention to these considerations on
common knowledge and rationality. We will study and compute well-defined
mathematical concepts, which are meaningful to these interpretations.
Let's see a few examples of strategic games.
Example 1.1.3. There are two players: $N = \{1, 2\}$. Player 1 has to select
a row, which may be either Top or Bottom: $A^1 = \{T, B\}$. Player 2
has to select a column, Left or Right: $A^2 = \{L, R\}$. The payoffs,
i.e. the mappings $g^1$ and $g^2$, are given by the following matrix:
$$
\begin{array}{c|cc}
  & L & R \\
\hline
T & (1, 1) & (3, 0) \\
B & (0, 3) & (0, 0)
\end{array}
$$
The entries of the matrix represent elements of $A^1 \times A^2$. In each entry there
is a couple of real numbers: the first component is the payoff to player 1, the
second is the payoff to player 2. For example in the $(T, R)$ cell we read $(3, 0)$:
it means that $g^1(T, R) = 3$ and $g^2(T, R) = 0$. In the entry $(B, R)$ one reads
$(0, 0)$: consequently $g^1(B, R) = 0$ and $g^2(B, R) = 0$, etc. This (bi-)matrix is
thus a convenient tool to represent the payoff functions.
A strategic game with 2 players, each of them having a finite number
of actions, will almost always be represented by a matrix as in the above
example. Player 1 will choose a row, player 2 will choose a column, the first
component of the selected entry will be the payoff of player 1 and the second
component the payoff of player 2.
Example 1.1.4.
$$
\begin{array}{c|cc}
  & L & R \\
\hline
T & (1, 0) & (0, 1) \\
B & (0, 0) & (1, 1)
\end{array}
$$
What should player 1 play?
Example 1.1.5. Matching Pennies
$$
\begin{array}{c|cc}
  & L & R \\
\hline
T & (1, -1) & (-1, 1) \\
B & (-1, 1) & (1, -1)
\end{array}
$$
Quite often, a strategic interaction contains several stages, and we will
see in chapter 2 how these games can still be analysed with strategic games.
Here is a simple example.
Example 1.1.6. There are 2 players, and the strategic interaction is given
by the following tree:
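[Figure: game tree. At the root $x_0$, player 1 (P1) chooses $L$ or $R$.
After $L$ the game reaches node $x_1$, where player 2 (P2) chooses $l_1$,
leading to the payoff $(10, 9)$, or $r_1$, leading to $(0, 10)$.
After $R$ the game reaches node $x_2$, where P2 chooses $l_2$,
leading to $(1, 2)$, or $r_2$, leading to $(3, 1)$.
The four terminal nodes are labeled $x_3, \dots, x_6$.]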
Such a tree is interpreted as follows. The game starts at the highest node
$x_0$, which is called the root of the tree. At this node player 1 is playing and
has to choose between $L$ and $R$.
Suppose that player 1 chooses $L$ at $x_0$. Then the game goes to $x_1$. Player
2 now has to play; he knows that the game is at $x_1$, so he knows that player
1 has chosen $L$. Then player 2 has to choose between $l_1$ and $r_1$. If he selects
$l_1$, the game is over and the payoff is $(10, 9)$. As before, the first component
is the payoff to player 1 (here 10), and the second the payoff to player 2 (here
9). Finally, if player 2 chooses $r_1$, then the game is over and the payoff is
$(0, 10)$.
Suppose now that player 1 plays $R$ at $x_0$. Similarly, the game goes to
$x_2$, player 2 has to play and knows that the game is at $x_2$. He has the
choice between $l_2$ and $r_2$. In each case, the game is over and the payoffs are
mentioned under the terminal node which is reached.
We have just described a strategic interaction having several stages by
means of a tree, we will call it an extensive-form game. It is however possible
to represent this game as a simultaneous one-shot game, i.e. as a strategic
game, as follows.
The set of players is $N = \{1, 2\}$. The set of strategies of player 1 simply
is $A^1 = \{L, R\}$. The situation is different for player 2. Imagine that before
playing the game, player 2 wants to describe to a friend his strategy: he
should specify what he will play if player 1 plays $L$, but also what he will
play if player 1 plays $R$. Consequently, he should communicate an element
of $\{l_1, r_1\}$, and an element of $\{l_2, r_2\}$. We thus set:
$A^2 = \{l_1, r_1\} \times \{l_2, r_2\} = \{(l_1, l_2), (l_1, r_2), (r_1, l_2), (r_1, r_2)\}$.
And we determine the payoff functions by
following, for each strategy pair in $A^1 \times A^2$, the play induced on the tree by
the strategy pair. We thus get:
$$
\begin{array}{c|cccc}
  & (l_1, l_2) & (l_1, r_2) & (r_1, l_2) & (r_1, r_2) \\
\hline
L & (10, 9) & (10, 9) & (0, 10) & (0, 10) \\
R & (1, 2) & (3, 1) & (1, 2) & (3, 1)
\end{array}
$$
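The construction of the strategic form from the tree can be sketched in a few lines of Python. This is only an illustration: the names `leaves`, `play` and `strategic_form` are hypothetical, and the tree of example 1.1.6 is encoded by hand as a dictionary of terminal payoffs.

```python
from itertools import product

# Terminal payoffs of the tree of example 1.1.6, indexed by
# (move of player 1, move of player 2 at the node that is reached).
leaves = {
    ("L", "l1"): (10, 9),
    ("L", "r1"): (0, 10),
    ("R", "l2"): (1, 2),
    ("R", "r2"): (3, 1),
}

A1 = ["L", "R"]
# A strategy of player 2 is a complete plan: one move at x1 and one move at x2.
A2 = list(product(["l1", "r1"], ["l2", "r2"]))


def play(a1, a2):
    """Follow the play induced on the tree by the strategy pair (a1, a2)."""
    move_at_x1, move_at_x2 = a2
    move = move_at_x1 if a1 == "L" else move_at_x2
    return leaves[(a1, move)]


# Payoff table of the associated strategic game.
strategic_form = {(a1, a2): play(a1, a2) for a1 in A1 for a2 in A2}

for a1 in A1:
    print(a1, [strategic_form[(a1, a2)] for a2 in A2])
```

Each row printed corresponds to one row of the matrix above, with the columns in the order $(l_1, l_2), (l_1, r_2), (r_1, l_2), (r_1, r_2)$.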
Fix for a while a strategic game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$. We will always
use the following notations.
Notations 1.1.7.
$$A = \prod_{i \in N} A^i.$$
An element $a = (a^i)_{i \in N}$ in $A$ is called an action (or strategy) profile. Fix now
a player $i$ in $N$; this player will sometimes simply be denoted by P$i$. We put:
$$A^{-i} = \prod_{j \in N \setminus \{i\}} A^j.$$
For $a = (a^j)_{j \in N} \in A$, we write $a^{-i} = (a^j)_{j \in N \setminus \{i\}} \in A^{-i}$. $a^{-i}$ corresponds
to the action profile played by all players but player $i$. With a small abuse
of notation, we will write $a = (a^i, a^{-i})$ when we want to focus on player $i$'s
action. In general, the exponent $-i$ represents all the players except player
$i$.
Finally, we will denote by $g$ the vector payoff function of the game. More
precisely, $g$ is the mapping from $A$ to $\mathbb{R}^N$ which associates to every action
profile $a$ the vector payoff $g(a) = (g^i(a))_{i \in N}$ in $\mathbb{R}^N$.
We now turn to the meaning of "playing well" in a normal-form
game. In example 1.1.3, it seems clear that player 1 should play $T$
because his payoff will be 1 or 3 rather than 0. Similarly, player 2 should
play $L$. So in example 1.1.3 the action profile $(T, L)$ emerges as the unique
rational outcome of the game.
Let us consider now example 1.1.4. If player 2 is rational, he is going
to play $R$ to gain 1 rather than 0, independently of the move of player 1.
Player 1, being clever, should anticipate this and play $B$ and not $T$. The
profile $(B, R)$ appears here to be the unique rational outcome of the game.
It is a priori not clear what should be played in example 1.1.5;
we will come back to it later. Let us finally consider example 1.1.6. The
following argument is based on the game tree. If the game is at $x_1$, player 2
is going to play $r_1$ to get a payoff of 10 instead of 9. If the game is at $x_2$,
player 2 will choose $l_2$ to get a payoff of 2 instead of 1. Player 1 should
anticipate this, and consequently choose $R$ at $x_0$ to get a payoff of 1 instead of
0. Thus $(R, (r_1, l_2))$ seems to be the rational outcome of the game. We will come
back later to this argument, called backwards induction, and to its validity.
It should be clear that the three previous paragraphs follow common sense but
have no mathematical meaning yet, since we have not yet defined any solution
concept for strategic games, nor have we defined the meanings of "rational"
or "playing well".
We now give a few general definitions.
Definition 1.1.8. A strategic game is finite if the set of players and the sets
of actions are all finite.
Definition 1.1.9. Fix $i$ in $N$, and two actions $a^i$ and $b^i$ of player $i$ in $A^i$.
$b^i$ strictly dominates $a^i$ if: $\forall a^{-i} \in A^{-i}$, $g^i(b^i, a^{-i}) > g^i(a^i, a^{-i})$.
$b^i$ weakly dominates $a^i$ if:
$$
\begin{cases}
\forall a^{-i} \in A^{-i}, & g^i(b^i, a^{-i}) \geq g^i(a^i, a^{-i}), \\
\text{and } \exists a^{-i} \in A^{-i}, & g^i(b^i, a^{-i}) > g^i(a^i, a^{-i}).
\end{cases}
$$
$a^i$ is strictly (resp. weakly) dominated if there exists a strategy $c^i$ in $A^i$
such that $c^i$ strictly (resp. weakly) dominates $a^i$.
The next example, called the prisoner's dilemma, is very famous.
Example 1.1.10.
$$
\begin{array}{c|cc}
  & L & R \\
\hline
T & (1, 1) & (-1, 2) \\
B & (2, -1) & (0, 0)
\end{array}
$$
Strategy $T$ of player 1 is strictly dominated by strategy $B$. Regarding player
2's actions, $L$ is strictly dominated by $R$.
Numerous stories can illustrate this strategic interaction (prisoners hav-
ing to confess or deny a crime, nuclear arms race,...), let us just mention here
the following version. Initially, player 1 possesses an apple, and player 2 has
a banana. Both players should simultaneously decide whether to keep their
fruit or to give it to the other player (to give corresponds to action T for
player 1 and to action L for player 2). It happens that player 1 prefers the
banana and player 2 the apple. More precisely, each player has the following
preferences, strictly ranked from the best one to the worst one: to have both
fruits comes rst, then comes having his favorite fruit only, next is having
his initial fruit only, and the worst is getting no fruit at all.
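As an illustration of definition 1.1.9, the following Python sketch checks the dominance claims of example 1.1.10. It assumes the payoffs $(1, 1)$, $(-1, 2)$, $(2, -1)$, $(0, 0)$, which are consistent with the ranking of the fruit story (both fruits: 2, favorite fruit only: 1, initial fruit only: 0, no fruit: $-1$). The helper names `g`, `strictly_dominates` and `weakly_dominates` are hypothetical.

```python
# Payoffs of example 1.1.10, indexed by (action of player 1, action of player 2).
payoffs = {
    ("T", "L"): (1, 1),  ("T", "R"): (-1, 2),
    ("B", "L"): (2, -1), ("B", "R"): (0, 0),
}
A = [["T", "B"], ["L", "R"]]  # A[0]: actions of player 1, A[1]: actions of player 2


def g(i, own, other):
    """Payoff of player i (0 or 1) playing `own` while the opponent plays `other`."""
    profile = (own, other) if i == 0 else (other, own)
    return payoffs[profile][i]


def strictly_dominates(i, b, a):
    """True if action b of player i strictly dominates his action a."""
    return all(g(i, b, o) > g(i, a, o) for o in A[1 - i])


def weakly_dominates(i, b, a):
    """True if b is never worse than a and strictly better at least once."""
    never_worse = all(g(i, b, o) >= g(i, a, o) for o in A[1 - i])
    sometimes_better = any(g(i, b, o) > g(i, a, o) for o in A[1 - i])
    return never_worse and sometimes_better


print(strictly_dominates(0, "B", "T"))  # keeping strictly dominates giving for player 1
print(strictly_dominates(1, "R", "L"))  # and likewise for player 2
```

With these definitions, strict domination implies weak domination, as in the text.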
The following notion will play a great role in the sequel.
Definition 1.1.11. Fix $i$ in $N$, $a^i$ in $A^i$, and $a^{-i}$ in $A^{-i}$. We say that $a^i$ is
a best reply (or best response) of player $i$ against $a^{-i}$ if:
$$\forall b^i \in A^i, \quad g^i(a^i, a^{-i}) \geq g^i(b^i, a^{-i}).$$
In words, $a^i$ is a best reply against $a^{-i}$ if player $i$ should play $a^i$ if the
other players play according to $a^{-i}$. This is equivalent to:
$$g^i(a^i, a^{-i}) = \max_{b^i \in A^i} g^i(b^i, a^{-i}).$$
The existence of (at least one) best reply of player $i$ against $a^{-i}$ is equivalent
to the existence of the maximum of the set $\{g^i(b^i, a^{-i}),\ b^i \in A^i\}$. When $A^i$ is
itself finite, this maximum always exists, so player $i$ always has at least one
best reply against any action profile in $A^{-i}$. In some cases, the maximum
may not exist, and player $i$ does not have any best reply against $a^{-i}$. In
general, it is often the case that several best replies exist. For example in
example 1.1.6, the best replies of player 2 against $L$ are $(r_1, l_2)$ and $(r_1, r_2)$.
Remark that a strictly dominated strategy can never be a best reply.
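For a finite action set, the set of best replies can be computed by a direct maximization. The sketch below does this for player 2 in the strategic form of example 1.1.6; the names `g2` and `best_replies_p2` are hypothetical and the payoffs are copied by hand from the matrix of that example.

```python
# Strategies of player 2 in example 1.1.6, in the column order of the matrix.
A2 = [("l1", "l2"), ("l1", "r2"), ("r1", "l2"), ("r1", "r2")]

# Payoffs of player 2 against each action of player 1 (same column order as A2).
g2 = {
    "L": [9, 9, 10, 10],
    "R": [2, 1, 2, 1],
}


def best_replies_p2(a1):
    """Strategies of player 2 attaining the maximum payoff against a1."""
    best = max(g2[a1])  # the maximum exists because A2 is finite
    return [a2 for a2, value in zip(A2, g2[a1]) if value == best]


print(best_replies_p2("L"))
print(best_replies_p2("R"))
```

Against $L$ the function returns the two strategies mentioned above; against $R$ only the move at $x_2$ matters, so the best replies are the two strategies prescribing $l_2$.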
The first solution concept we introduce is the one of equilibrium in
dominant strategies.
1.2 Dominant strategy equilibrium
Definition 1.2.1. Fix $i$ in $N$, and $a^i$ in $A^i$.
$a^i$ is a dominant strategy of player $i$ if whatever the actions of the other
players, player $i$ should play $a^i$, i.e. if:
$$\forall a^{-i} \in A^{-i}, \ \forall b^i \in A^i, \quad g^i(a^i, a^{-i}) \geq g^i(b^i, a^{-i}).$$
Equivalently, a dominant strategy of player $i$ is a strategy of this player
which is a best reply against any action profile of the other players.
Definition 1.2.2. An equilibrium in dominant strategies of $G$ is an action
profile $a = (a^i)_{i \in N}$ in $A$ such that for each player $i$ in $N$, $a^i$ is a dominant
strategy of player $i$.
In example 1.1.3, $(T, L)$ is an equilibrium in dominant strategies. In
the prisoner's dilemma (example 1.1.10), $(B, R)$ is an equilibrium in
dominant strategies. Even if it is socially preferable that the players exchange
their fruits, it is believed that rational players won't. Equilibria in dominant
strategies do not exist in the examples 1.1.4, 1.1.5, 1.1.6. Be careful that
there may exist several equilibria in dominant strategies, and even several
dominant strategy equilibrium payoffs.
Second-price auctions are important examples where a dominant strategy
equilibrium exists (see exercise 6.1.3). In a general strategic interaction,
the best reply of a player typically depends on the actions of the other
players, and dominant strategies seldom exist. The following concept of
Nash equilibrium is fundamental.
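The definition of a dominant strategy translates directly into a brute-force check for finite two-player games. The following sketch is an illustration only; the function name `dominant_strategies` and the dictionaries `ex_1_1_3` and `ex_1_1_4` are hypothetical names for the payoff matrices of examples 1.1.3 and 1.1.4.

```python
def dominant_strategies(payoffs, A1, A2, player):
    """Actions of `player` (1 or 2) that are best replies against every opponent action."""
    own, other = (A1, A2) if player == 1 else (A2, A1)

    def g(a, o):
        # Payoff of `player` when he plays a and the opponent plays o.
        profile = (a, o) if player == 1 else (o, a)
        return payoffs[profile][player - 1]

    return [a for a in own
            if all(g(a, o) >= g(b, o) for o in other for b in own)]


A1, A2 = ["T", "B"], ["L", "R"]
ex_1_1_3 = {("T", "L"): (1, 1), ("T", "R"): (3, 0),
            ("B", "L"): (0, 3), ("B", "R"): (0, 0)}
ex_1_1_4 = {("T", "L"): (1, 0), ("T", "R"): (0, 1),
            ("B", "L"): (0, 0), ("B", "R"): (1, 1)}

print(dominant_strategies(ex_1_1_3, A1, A2, 1), dominant_strategies(ex_1_1_3, A1, A2, 2))
print(dominant_strategies(ex_1_1_4, A1, A2, 1), dominant_strategies(ex_1_1_4, A1, A2, 2))
```

An equilibrium in dominant strategies exists exactly when every player's list is non-empty. In example 1.1.4 player 2 has a dominant strategy but player 1 has none, so no such equilibrium exists.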
1.3 Nash equilibrium
1.3.1 Definition and first properties
Definition 1.3.1. A Nash equilibrium is an action profile such that each
player plays a best reply against the strategies of the other players. Formally,
let $a = (a^i)_{i \in N}$ be an action profile in $A$.
$a$ is a Nash equilibrium of $G$ if and only if:
$$\forall i \in N, \ \forall b^i \in A^i, \quad g^i(a^i, a^{-i}) \geq g^i(b^i, a^{-i}).$$
It is equivalent to: for all $i$ in $N$, $a^i$ is a best reply against $a^{-i}$.
When $a$ is a Nash equilibrium of $G$, the vector $(g^i(a))_{i \in N}$ in $\mathbb{R}^N$ is called
a Nash equilibrium payoff of $G$.
If the other players play according to $a^{-i}$, then player $i$ should play $a^i$
rather than any other action $b^i$.
In other words, a Nash equilibrium is an action profile such that there
is no unilateral deviation (i.e. deviation by a single player) which is strictly
profitable.
The simplest interpretation is that of a contract between the players.
Assume that the players agree, in one way or another, to play the action
profile $a$. If a player $i$ thinks that the other players are going to respect the
contract, i.e. are going to play according to $a^{-i}$, then player $i$ cannot do
better than playing $a^i$, i.e. than respecting the contract himself. With this
interpretation, a Nash equilibrium is nothing more than a stable contract.
We can also think of social norms or rules in force in our societies. In some
countries, people drive on the left. Any driver starting to drive on the right
of the road would increase his risk of accident and consequently decrease his
utility. Thus, everybody driving left corresponds to a Nash equilibrium.
Regarding the existence and the number of Nash equilibria in strategic
games, nothing can a priori be excluded. In the examples 1.1.3, 1.1.4,
1.1.6, 1.1.10, there is a unique Nash equilibrium, respectively: $(T, L)$, $(B, R)$,
$(R, (r_1, l_2))$, and $(B, R)$. There exists no Nash equilibrium in example 1.1.5
(even if we will see later how to extend this game and get the existence of a
Nash equilibrium in mixed strategies). And it is not unusual to have
several Nash equilibria, as in the next example.
Example 1.3.2.
$$
\begin{array}{c|cc}
  & L & R \\
\hline
T & (2, 1) & (0, 0) \\
B & (0, 0) & (1, 2)
\end{array}
$$
This game is called the "battle of the sexes". One may think of a couple
(player 1, player 2) who would like first and foremost to spend the evening
together. Each player has to choose between going Dancing (actions $T$ and
$L$) or going to watch Boxing (actions $B$ and $R$), and choices are supposed
to be simultaneous. Here, both $(T, L)$ and $(B, R)$ are Nash equilibria of the
game.
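For a finite two-player game, the Nash equilibria in the sense of definition 1.3.1 can be listed by testing every action profile for profitable unilateral deviations. The sketch below is a brute-force illustration; `pure_nash_equilibria` is a hypothetical name, and the matching pennies payoffs use the usual convention (player 1 wins 1 when the two choices match), which is an assumption here.

```python
from itertools import product


def pure_nash_equilibria(payoffs, A1, A2):
    """Action profiles from which no single player can strictly gain by deviating."""
    equilibria = []
    for a1, a2 in product(A1, A2):
        no_deviation_1 = all(payoffs[(a1, a2)][0] >= payoffs[(b1, a2)][0] for b1 in A1)
        no_deviation_2 = all(payoffs[(a1, a2)][1] >= payoffs[(a1, b2)][1] for b2 in A2)
        if no_deviation_1 and no_deviation_2:
            equilibria.append((a1, a2))
    return equilibria


A1, A2 = ["T", "B"], ["L", "R"]
battle_of_sexes = {("T", "L"): (2, 1), ("T", "R"): (0, 0),
                   ("B", "L"): (0, 0), ("B", "R"): (1, 2)}
# Assumed standard matching pennies payoffs (zero-sum).
matching_pennies = {("T", "L"): (1, -1), ("T", "R"): (-1, 1),
                    ("B", "L"): (-1, 1), ("B", "R"): (1, -1)}

print(pure_nash_equilibria(battle_of_sexes, A1, A2))
print(pure_nash_equilibria(matching_pennies, A1, A2))
```

The first call returns the two equilibria of the battle of the sexes, and the second returns an empty list, in line with the absence of Nash equilibrium in matching pennies.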
The notion of Nash equilibrium is weaker than the one of equilibrium in
dominant strategies, as the following proposition shows.
Proposition 1.3.3. An equilibrium in dominant strategies is a Nash
equilibrium.
Proof: Let $a = (a^i)_{i \in N}$ be an equilibrium in dominant strategies of $G$.
Fix a player $i$ in $N$. Since $a^i$ is a dominant strategy of player $i$, it is a best
reply against any action profile of the other players, and in particular it is a
best reply against $a^{-i}$. Hence $a$ is a Nash equilibrium of $G$.
The converse of proposition 1.3.3 is clearly false. However, we have the
following result.
Proposition 1.3.4. If $a = (a^i)_{i \in N}$ is a Nash equilibrium, then for each $i$ in
$N$ the strategy $a^i$ is not strictly dominated.
Proof: Let $a = (a^i)_{i \in N}$ be a Nash equilibrium of $G$, and let $i$ be in $N$. The
strategy $a^i$ is a best reply against $a^{-i}$, and consequently cannot be strictly
dominated.
The following example shows that a Nash equilibrium may consist of
weakly dominated strategies.
Example 1.3.5.
$$
\begin{array}{c|cc}
  & L & R \\
\hline
T & (1, 1) & (0, 0) \\
B & (0, 0) & (0, 0)
\end{array}
$$
$(B, R)$ (as well as $(T, L)$) is a Nash equilibrium. However $B$ is weakly
dominated by $T$, and $R$ is weakly dominated by $L$.
1.3.2 Existence of a Nash equilibrium
We now look for conditions yielding the existence of a Nash equilibrium. Let
us start with a little bit of algebra and analysis.
Definition 1.3.6. Let $X$ be a convex subset of a real vector space. A
mapping $f$ from $X$ to the reals is said to be quasi-concave if:
$$\forall x \in X, \ \forall y \in X, \ \forall \lambda \in [0, 1], \quad f(\lambda x + (1 - \lambda) y) \geq \min\{f(x), f(y)\}.$$
Lemma 1.3.7. Let $X$ be a convex subset of a real vector space, and $f$ be a
mapping from $X$ to the reals. $f$ is quasi-concave if and only if for any real
$\alpha$, the set $\{x \in X,\ f(x) \geq \alpha\}$ is convex.
Proof:
Assume that $f$ is quasi-concave. Fix $\alpha$ in $\mathbb{R}$, and put $A = \{x \in X,\ f(x) \geq \alpha\}$.
We will show that $A$ is convex. Let $x$ and $y$ be in $A$, and $\lambda$ be in
$[0, 1]$. Write $z = \lambda x + (1 - \lambda) y$. Since $f$ is quasi-concave, we have
$f(z) \geq \min\{f(x), f(y)\}$. Since $x$ and $y$ are in $A$, $\min\{f(x), f(y)\} \geq \alpha$. Consequently
$f(z) \geq \alpha$, and $z \in A$. $A$ is convex.
Conversely, assume that for each real $\alpha$, the set $\{x \in X,\ f(x) \geq \alpha\}$ is
convex. Consider $x$, $y$ in $X$, and $\lambda$ in $[0, 1]$. Put $\alpha = \min\{f(x), f(y)\} \in \mathbb{R}$,
and $A = \{z \in X,\ f(z) \geq \alpha\}$. We have $f(x) \geq \alpha$ and $f(y) \geq \alpha$, so both $x$
and $y$ are in $A$. By assumption $A$ is convex, so $\lambda x + (1 - \lambda) y$ is in $A$. So
$f(\lambda x + (1 - \lambda) y) \geq \alpha$, and $f$ is quasi-concave.
Recall that $f$ is concave if and only if: $\forall x \in X$, $\forall y \in X$, $\forall \lambda \in [0, 1]$,
$f(\lambda x + (1 - \lambda) y) \geq \lambda f(x) + (1 - \lambda) f(y)$. Because $\lambda f(x) + (1 - \lambda) f(y) \geq \min\{f(x), f(y)\}$
always holds for $\lambda$ in $[0, 1]$, a concave function always is
quasi-concave. The converse does not hold in general, and for example any
non-decreasing mapping from $\mathbb{R}$ to $\mathbb{R}$ is quasi-concave. So is any
non-increasing mapping from $\mathbb{R}$ to $\mathbb{R}$. And the mapping $(x \mapsto x^2)$, defined
over $\mathbb{R}_+$, is both convex and quasi-concave.
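As a quick check of the last claim, here is a short worked verification (added for illustration, using only definition 1.3.6) that $x \mapsto x^2$ on $\mathbb{R}_+$ is quasi-concave although it is not concave:

$$
\begin{aligned}
&\text{Let } 0 \le x \le y \text{ and } \lambda \in [0, 1]. \text{ Then } \lambda x + (1 - \lambda) y \ge x \ge 0, \text{ so} \\
&\bigl(\lambda x + (1 - \lambda) y\bigr)^2 \ge x^2 = \min\{x^2, y^2\}. \\
&\text{Concavity fails: for } x = 0,\ y = 2,\ \lambda = \tfrac12, \quad \bigl(\tfrac12 \cdot 0 + \tfrac12 \cdot 2\bigr)^2 = 1 < 2 = \tfrac12 \cdot 0^2 + \tfrac12 \cdot 2^2.
\end{aligned}
$$

The first two lines only use the fact that the mapping is non-decreasing on $\mathbb{R}_+$, which is the general argument behind the remark on monotone mappings.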
We shall also use the notion of correspondence.
Definition 1.3.8. Given two sets $X$ and $Y$, a correspondence (or multivalued
mapping) from $X$ to $Y$ is a mapping from $X$ to the set $\mathcal{P}(Y)$ of subsets of
$Y$. We write:
$$F : X \rightrightarrows Y, \quad x \mapsto F(x) \subset Y$$
to denote the correspondence associating to every $x$ in $X$ the subset $F(x)$ of
$Y$.
The graph of the correspondence $F$ is then defined as:
$$\mathrm{Graph}\, F = \{(x, y) \in X \times Y,\ y \in F(x)\}.$$
And an element $x$ in $X$ is said to be a fixed point of the correspondence $F$ if
$x \in F(x)$.
Notice that if $f$ is a mapping from $X$ to $Y$, we can define the
correspondence $F$ from $X$ to $Y$ which associates to each $x$ of $X$ the singleton
$F(x) = \{f(x)\}$. Then, the graph of the correspondence $F$ coincides with the
graph of the mapping $f$.
We will use, without proof, the following result. Recall that a Euclidean
space is simply a real vector space of finite dimension. In such a space, all
norms are equivalent, and compact subsets coincide with closed and bounded
subsets.
Theorem 1.3.9. (Kakutani's fixed point theorem)
Let $X$ be a non-empty, convex and compact subset of a Euclidean space, and let
$F : X \rightrightarrows X$ be a correspondence with compact graph such that for each $x$
in $X$, the set $F(x)$ is convex, compact and non-empty.
Then $F$ has a fixed point, i.e. there exists $x^*$ in $X$ such that $x^* \in F(x^*)$.
Notice that if the graph of $F$ is compact, then automatically
$F(x)$ is compact for each $x$ in $X$. Hence the hypothesis "$F(x)$ compact for each
$x$" could simply be removed from the statement of the theorem. The proof
goes beyond the scope of these notes; let us just mention a strong link with
Brouwer's fixed point theorem. If $f$ is a mapping from $X$ to $X$, define the
correspondence $F$ from $X$ to itself associating to each $x$ in $X$ the singleton
$\{f(x)\}$. Notice that a singleton is always non-empty, convex and compact. Hence
to apply Kakutani's theorem it is enough to satisfy the condition that the
graph of $F$ (or $f$) is compact. Since $X$ is compact, this graph is closed in
$X \times X$ if and only if $f$ is continuous. We then obtain the following famous
result.
Theorem 1.3.10. (Brouwer's fixed point theorem) Let $X$ be a non-empty
convex compact subset of a Euclidean space, and $f$ be a continuous mapping
from $X$ to $X$. Then $f$ has a fixed point, i.e. one can find $x^*$ in $X$ such that
$f(x^*) = x^*$.
Brouwer's theorem can thus be proved using Kakutani's theorem. But
we have admitted Kakutani's result, and as a matter of fact it is usual in
mathematics to proceed in the reverse order, i.e. to first prove Brouwer's
theorem and then to use it to get Kakutani's theorem. We now come back
to game theory strictly speaking.
Fix a strategic game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$.
Definition 1.3.11.
1) For a player $i$ in $N$ and an element $a^{-i}$ in $A^{-i}$, we denote by $R^i(a^{-i})$
the set of best replies of player $i$ against $a^{-i}$. We have:
$$R^i(a^{-i}) = \{a^i \in A^i,\ g^i(a^i, a^{-i}) = \max_{b^i \in A^i} g^i(b^i, a^{-i})\},$$
with the convention: $R^i(a^{-i}) = \emptyset$ if $\max_{b^i \in A^i} g^i(b^i, a^{-i})$ does not exist.
2) Fix $i$ in $N$. We define
$$R^i : A^{-i} \rightrightarrows A^i, \quad a^{-i} \mapsto R^i(a^{-i}).$$
$R^i$ is called the best reply correspondence of player $i$.
3) The best reply correspondence of the game $G$ is defined as:
$$R : A \rightrightarrows A, \quad a = (a^i)_{i \in N} \mapsto R(a) = \prod_{i \in N} R^i(a^{-i}) = \{(b^i)_{i \in N} \in A,\ \forall i \in N \ b^i \in R^i(a^{-i})\}.$$
By definition, a Nash equilibrium of $G$ is an action profile $a = (a^i)_{i \in N}$
such that for each $i$ in $N$, $a^i$ is a best reply against $a^{-i}$, i.e. $a^i \in R^i(a^{-i})$.
As a consequence we have the following result.
Lemma 1.3.12. A strategy profile is a Nash equilibrium of $G$ if and only if
it is a fixed point of the best reply correspondence of $G$.
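Lemma 1.3.12 can be illustrated on a finite game by building the best reply correspondence explicitly and looking for its fixed points. The sketch below does this for example 1.3.5; the function names `R1`, `R2` and `R` are hypothetical and mirror the notation of definition 1.3.11 for two players.

```python
from itertools import product

A1, A2 = ["T", "B"], ["L", "R"]
# Payoffs of example 1.3.5.
payoffs = {("T", "L"): (1, 1), ("T", "R"): (0, 0),
           ("B", "L"): (0, 0), ("B", "R"): (0, 0)}


def R1(a2):
    """Best replies of player 1 against a2."""
    best = max(payoffs[(b1, a2)][0] for b1 in A1)
    return {a1 for a1 in A1 if payoffs[(a1, a2)][0] == best}


def R2(a1):
    """Best replies of player 2 against a1."""
    best = max(payoffs[(a1, b2)][1] for b2 in A2)
    return {a2 for a2 in A2 if payoffs[(a1, a2)][1] == best}


def R(a):
    """Best reply correspondence of the game: the product R1(a2) x R2(a1)."""
    a1, a2 = a
    return set(product(R1(a2), R2(a1)))


# Fixed points of R, i.e. profiles a with a in R(a).
fixed_points = [a for a in product(A1, A2) if a in R(a)]
print(fixed_points)
```

The fixed points found are $(T, L)$ and $(B, R)$, the two Nash equilibria of example 1.3.5.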
We can now give sufficient conditions for the existence of a Nash
equilibrium.
Theorem 1.3.13. (Glicksberg's theorem)
Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a strategic game where $N = \{1, ..., n\}$ is
a finite set. We assume that:
1) For each $i$ in $N$, $A^i$ is a convex compact set of a Euclidean space.
2) For each $i$ in $N$, $g^i$ is continuous.
3) For each $i$ in $N$, for each $a^{-i}$ in $A^{-i}$,
the mapping $A^i \to \mathbb{R}$, $a^i \mapsto g^i(a^i, a^{-i})$ is quasi-concave.
Then a Nash equilibrium of $G$ exists.
Moreover, the set of Nash equilibria of $G$ is closed in $A$. The set of Nash
equilibrium payoffs of $G$ is a compact subset of $\mathbb{R}^n$.
Proof:
The set of action profiles $A$ is by definition $\prod_{i \in N} A^i$. Since each action
set $A^i$ is a subset of a real vector space with finite dimension, $A$ also is a
subset of a Euclidean space. Moreover, each $A^i$ being non-empty, convex
and compact, so is $A$. We will apply Kakutani's theorem to the best reply
correspondence $R$.
Fix $i$ in $N$ and $a^{-i}$ in $A^{-i}$. The set of best replies of player $i$ against $a^{-i}$ is:
$$R^i(a^{-i}) = \{a^i \in A^i,\ g^i(a^i, a^{-i}) = \max_{b^i \in A^i} g^i(b^i, a^{-i})\}.$$
Since $g^i$ is continuous and $A^i$ is compact, $R^i(a^{-i})$ is non-empty and closed
in $A^i$. Notice that we can also write:
$R^i(a^{-i}) = \{a^i \in A^i,\ g^i(a^i, a^{-i}) \geq \max_{b^i \in A^i} g^i(b^i, a^{-i})\}$.
By the quasi-concavity assumption 3), and by lemma
1.3.7, $R^i(a^{-i})$ is convex. So $R^i(a^{-i})$ is non-empty, convex and compact.
For each $a$ in $A$, $R(a) = \prod_{i \in N} R^i(a^{-i})$ is thus non-empty, convex and
compact. We now show that the graph of $R$ is closed in $A \times A$.
Consider a sequence $(a_t, b_t)_{t \geq 0}$ with values in $\mathrm{Graph}\, R$ which converges
to a limit $(a, b)$ in $A \times A$. Let us show that $(a, b) \in \mathrm{Graph}\, R$.
For each $t \geq 0$, write $a_t = (a^i_t)_{i \in N}$ and $b_t = (b^i_t)_{i \in N}$. Since
$(a_t, b_t) \in \mathrm{Graph}\, R$, we have by definition:
$$\forall i \in N, \ \forall c^i \in A^i, \quad g^i(b^i_t, a^{-i}_t) \geq g^i(c^i, a^{-i}_t).$$
The mapping $g^i$ being continuous, we can go to the limit when $t$ goes to
infinity and obtain:
$$\forall i \in N, \ \forall c^i \in A^i, \quad g^i(b^i, a^{-i}) \geq g^i(c^i, a^{-i}).$$
This means that for each player $i$, $b^i$ is a best reply against $a^{-i}$. So $b \in R(a)$,
and finally the graph of $R$ is closed. Since $A \times A$ is compact, the graph of $R$
is compact.
Consequently, we apply Kakutani's theorem to $R$, and we get the
existence of an element $a^*$ of $A$ such that $a^* \in R(a^*)$. By lemma 1.3.12, $a^*$ is a
Nash equilibrium of $G$.
Denote now by $NE$ the set of Nash equilibria of $G$. We have
$NE = \{a \in A,\ a \in R(a)\} = \{a \in A,\ (a, a) \in \mathrm{Graph}\, R\}$. Since the graph of $R$ is closed
in $A \times A$, and since the mapping from $A$ to $A \times A$ associating to each $a$
the couple $(a, a)$ is continuous, we obtain that $NE$ is closed in $A$, hence is
compact.
The set of Nash equilibrium payoffs of $G$ is $\{g(a),\ a \in NE\} = g(NE)$.
For each player $i$, $g^i$ is continuous, hence so is $g$. The set $NE$ being compact,
$g(NE)$ is a continuous image of a compact set, hence also is a compact set.
The set of Nash equilibrium payoffs of $G$ is thus compact.
1.3.3 Iterated elimination of strictly dominated strategies
We now show that, when computing Nash equilibria, it is always possible to
rst remove a strategy which is strictly dominated.
Fix a player $i$ and a strictly dominated strategy $b^i$ of this player. By
proposition 1.3.4, $b^i$ is not played in a Nash equilibrium. Consider now the
game $G'$ obtained from $G$ by removing the strategy $b^i$. To be precise, $G'$
is the game $(N, (A'^j)_{j \in N}, (g'^j)_{j \in N})$, where: $A'^i = A^i \setminus \{b^i\}$, for each player $j$
different from $i$ we have $A'^j = A^j$, and for each action profile $a$ in $A' = \prod_{j \in N} A'^j$, we
simply have $g'^j(a) = g^j(a)$ for every $j$ in $N$.
Proposition 1.3.14. Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a strategic-form
game, $i$ be a player in $N$ and $b^i \in A^i$ be a strictly dominated strategy of
player $i$. Let $G'$ be the game obtained from $G$ when the strategy $b^i$ has been
removed.
Then the Nash equilibria of $G$ and $G'$ coincide.
Proof:
1) Let $a = (a^j)_{j \in N}$ be a NE of $G$. By proposition 1.3.4, $a^i$ is not strictly
dominated, so $a^i \neq b^i$. The action profile $a$ can be played in $G'$. For each
player $j$ in $N$, $a^j$ is a best reply against $a^{-j}$ in $G$, and since all actions
available for $j$ in $G'$ are also available in $G$, $a^j$ also is a best reply
against $a^{-j}$ in $G'$. $a$ is a Nash equilibrium of $G'$.
2) Let $a = (a^j)_{j \in N}$ be a Nash equilibrium of $G'$. $a$ is playable in $G$. For
each player $j$ in $N \setminus \{i\}$, $a^j$ is a best reply against $a^{-j}$ in $G'$, and $A'^j = A^j$,
so $a^j$ is a best reply against $a^{-j}$ in $G$. We now show that $a^i$ is a best
reply against $a^{-i}$ in $G$. We have: $\forall c^i \in A^i \setminus \{b^i\}$, $g^i(a^i, a^{-i}) \geq g^i(c^i, a^{-i})$.
In addition $b^i$ is strictly dominated in $G$, so there exists $c^i$ in $A^i \setminus \{b^i\}$ such
that: $g^i(b^i, a^{-i}) < g^i(c^i, a^{-i}) \leq g^i(a^i, a^{-i})$. Consequently we have: $\forall c^i \in A^i$,
$g^i(a^i, a^{-i}) \geq g^i(c^i, a^{-i})$, and $a^i$ is a best reply against $a^{-i}$ in $G$. $a$ is a Nash
equilibrium of $G$.
Notice that part 2) of the proof applies as soon as $b^i$ is a weakly dominated strategy of player $i$. As a consequence, when removing a weakly dominated strategy, the set of Nash equilibria can only decrease: the new set of NE is included in the previous one. Notice also that proposition 1.3.14 only applies when removing one (or any finite number of) strictly dominated strategies: be careful that one may create new NE by removing an infinite number of strictly dominated strategies. Find an example.
Proposition 1.3.14 is sometimes very useful when computing the set of NE of a particular game. One can eliminate a finite number of strictly dominated strategies and obtain a game $G'$. Then some strategies may become strictly dominated in $G'$, and one can start again and eliminate a finite number of them. A new game $G''$ is obtained and again, some strategies may become strictly dominated, etc. We can continue until no strategy is strictly dominated, and if it happens that in the remaining game all strategy profiles yield the same payoff vector, we say that the initial game $G$ is solvable via iterated elimination of strictly dominated strategies.
Remark 1.3.15. The idea of iterated elimination of strictly dominated strategies can be thought of as follows. It is rational to think that a rational player is not going to play a strictly dominated strategy. If all players are rational and think that the other players are rational, all players are going to consider the game $G'$. If all players think that all players think that all players are rational, we can continue: all players know that $G'$ is considered, so a strictly dominated strategy in $G'$ may be removed, etc. At each step, by proposition 1.3.14 the set of Nash equilibria remains the same.
Example 1.3.16. Let $G$ be the following 2-player game:
$$\begin{array}{c|cc} & l & r \\ \hline T & (3, 0) & (2, 1) \\ M & (0, 0) & (3, 1) \\ B & (1, 1) & (1, 0) \end{array}$$
Strategy $B$ is strictly dominated by $T$ for player 1. The game $G'$ obtained from $G$ by elimination of $B$ is given by:
$$\begin{array}{c|cc} & l & r \\ \hline T & (3, 0) & (2, 1) \\ M & (0, 0) & (3, 1) \end{array}$$
Now, $l$ is strictly dominated by $r$ for player 2 in $G'$. One can remove $l$ and obtain the game $G''$ represented by:
$$\begin{array}{c|c} & r \\ \hline T & (2, 1) \\ M & (3, 1) \end{array}$$
But $T$ is now strictly dominated by $M$, hence we finally obtain:
$$\begin{array}{c|c} & r \\ \hline M & (3, 1) \end{array}$$
The unique Nash equilibrium of $G$ is the action profile $(M, r)$.
The game of exercise 6.1.13 has lots of strictly dominated strategies.
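The elimination procedure of example 1.3.16 can be sketched in a few lines of Python. This is only an illustrative sketch for 2-player finite games: the function name `iterated_elimination` and the dictionary encoding of the payoffs are assumptions made here, not notation from these notes, and only domination by pure strategies is tested.

```python
def iterated_elimination(rows, cols, payoffs):
    """Iteratively remove strictly dominated pure strategies (2 players).

    payoffs[(r, c)] = (payoff of player 1, payoff of player 2).
    Hypothetical helper: only domination by another pure strategy is tested.
    """
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # Player 1: look for a row b strictly dominated by another row d.
        for b in rows:
            if any(all(payoffs[(d, c)][0] > payoffs[(b, c)][0] for c in cols)
                   for d in rows if d != b):
                rows.remove(b)
                changed = True
                break
        if changed:
            continue
        # Player 2: look for a column b strictly dominated by another column d.
        for b in cols:
            if any(all(payoffs[(r, d)][1] > payoffs[(r, b)][1] for r in rows)
                   for d in cols if d != b):
                cols.remove(b)
                changed = True
                break
    return rows, cols


# The game of example 1.3.16.
example = {
    ("T", "l"): (3, 0), ("T", "r"): (2, 1),
    ("M", "l"): (0, 0), ("M", "r"): (3, 1),
    ("B", "l"): (1, 1), ("B", "r"): (1, 0),
}
survivors = iterated_elimination(["T", "M", "B"], ["l", "r"], example)
print(survivors)
```

On this game the procedure removes B, then l, then T, and a single profile survives, in line with the example.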
1.4 Mixed strategies
We come back to example 1.1.5 (Matching Pennies):
$$\begin{array}{c|cc} & L & R \\ \hline T & (1, -1) & (-1, 1) \\ B & (-1, 1) & (1, -1) \end{array}$$
Formally no Nash equilibrium exists here. Imagine, however, that you have to write a computer program for this game (as player 1), and that this program should be run on a website where everyone can come and play for real, maybe several times, the payoff being (plus or minus) one euro for each game. How should you write such a program? A program always playing $T$ (or $B$) would soon lose a lot of money. It is clear here that there exists an appropriate way to proceed, which is to write a program selecting independently at each run to play $T$ or $B$ with probability 1/2. Even if the users know that you have written such a program, they cannot profit from this and have a positive expectation of gain: whether they play $L$ or $R$, the expected payoff is 0. (And it is now enough for you to charge a 5 cents fee for every play to quickly make a profit.)
But how would you play in the following game?
$$\begin{array}{c|cc} & L & R \\ \hline T & (2, -2) & (-1, 1) \\ B & (-1, 1) & (1, -1) \end{array}$$
1.4.1 Finite games and mixed strategies
We consider here a finite strategic game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$. To simplify notations, we will assume that $N = \{1, ..., n\}$, where $n$ is the number of players. We will define an extended game $\tilde{G}$, obtained from $G$ by allowing the players to select their action randomly and considering expected payoffs.
Notation 1.4.1. Given a finite set $S$, we denote by $\Delta(S)$ the set of probabilities over $S$ (endowed with the $\sigma$-algebra of all subsets of $S$). A probability on $S$ will be written $x = (x(s))_{s \in S}$, with $x(s) \geq 0$ for each $s$ in $S$ and $\sum_{s \in S} x(s) = 1$. We have:
$$\Delta(S) = \{x = (x(s))_{s \in S} \in \mathbb{R}^S, \ \forall s \in S \ x(s) \geq 0 \text{ and } \sum_{s \in S} x(s) = 1\}.$$
Given $x$ in $\Delta(S)$, $x(s)$ is the probability of $s$ under $x$, i.e. the probability that the lottery $x$ selects $s$. $S$ being finite, $\mathbb{R}^S$ is a Euclidean space. For $x$ in $\Delta(S)$, we have $x(s) \in [0, 1]$ for each $s$ in $S$. Hence $\Delta(S)$ is bounded. It is also a closed set of $\mathbb{R}^S$, hence $\Delta(S)$ is a compact set. If $x$ and $y$ are in $\Delta(S)$, and $\lambda$ is in $[0, 1]$, one defines the convex combination: $\lambda x + (1 - \lambda) y = (z(s))_{s \in S} \in \Delta(S)$, with for each $s$, $z(s) = \lambda x(s) + (1 - \lambda) y(s)$. Hence $\Delta(S)$ is a convex subset of $\mathbb{R}^S$.
Definition 1.4.2. Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a finite strategic game, with $N = \{1, ..., n\}$. The mixed extension of $G$ is the strategic game $\tilde{G} = (N, (\Delta(A^i))_{i \in N}, (\tilde{g}^i)_{i \in N})$, where for each player $i$ in $N$ the payoff function $\tilde{g}^i$ is defined by: $\forall (x^1, ..., x^n) \in \prod_{j \in N} \Delta(A^j)$,
$$\tilde{g}^i(x^1, ..., x^n) = \mathbb{E}_{x^1 \otimes ... \otimes x^n}(g^i) = \sum_{(a^1, ..., a^n) \in A} x^1(a^1) \cdots x^n(a^n) \, g^i(a^1, ..., a^n).$$
In $\tilde{G}$, each player can play a lottery, or probability over actions. If each player $i$ in $N$ plays the lottery $x^i$ on $A^i$, this defines a probability $x^1 \otimes x^2 \otimes ... \otimes x^n$ over $A$. $x^1 \otimes x^2 \otimes ... \otimes x^n$ is called the (direct) product of the probabilities $(x^i)_{i \in N}$, meaning that the lotteries used by the players are independent. And the players want to maximize their expected payoff (to some extent, this can be justified via the notion of von Neumann-Morgenstern utility functions).
As an example, suppose that in Matching Pennies player 1 chooses the probability $(x^1(T), x^1(B)) = (2/3, 1/3)$ whereas player 2 chooses the probability $(x^2(L), x^2(R)) = (2/5, 3/5)$. This induces the product probability $x^1 \otimes x^2$ over the entries of the matrix: $(T, L)$ has probability $(2/3)(2/5) = 4/15$, $(T, R)$ has probability $(2/3)(3/5) = 6/15$, $(B, L)$ has probability $(1/3)(2/5) = 2/15$, and $(B, R)$ has probability $(1/3)(3/5) = 3/15$. Since player 1 receives 1 at $(T, L)$ and $(B, R)$ and $-1$ at $(T, R)$ and $(B, L)$, the expected payoffs are $\tilde{g}^1(x^1, x^2) = (4/15)(1) + (6/15)(-1) + (2/15)(-1) + (3/15)(1) = -1/15$, and $\tilde{g}^2(x^1, x^2) = +1/15$.
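The expected payoff of definition 1.4.2 is easy to evaluate mechanically. The following Python sketch does it for two players with exact fractions; the name `mixed_payoff` and the dictionary encoding are assumptions made for illustration, and the payoff table is the standard Matching Pennies matrix (player 1 wins 1 at $(T, L)$ and $(B, R)$).

```python
from fractions import Fraction as F
from itertools import product


def mixed_payoff(g, x1, x2):
    """Expected payoff vector of a 2-player game under the product of x1 and x2.

    g[(a1, a2)] = (payoff of player 1, payoff of player 2).
    x1 and x2 map each pure action to its probability.
    """
    total = [F(0), F(0)]
    for a1, a2 in product(x1, x2):
        weight = x1[a1] * x2[a2]  # independence of the two lotteries
        for i in range(2):
            total[i] += weight * g[(a1, a2)][i]
    return tuple(total)


# Standard Matching Pennies payoffs (assumed encoding).
matching_pennies = {
    ("T", "L"): (1, -1), ("T", "R"): (-1, 1),
    ("B", "L"): (-1, 1), ("B", "R"): (1, -1),
}
x1 = {"T": F(2, 3), "B": F(1, 3)}
x2 = {"L": F(2, 5), "R": F(3, 5)}
payoffs = mixed_payoff(matching_pennies, x1, x2)
print(payoffs)
```

With these weights the four entries have probabilities 4/15, 6/15, 2/15 and 3/15, and the two expected payoffs are opposite, as they must be in a zero-sum game.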
Vocabulary: For $i$ in $N$, an element of $\Delta(A^i)$ is called a mixed strategy of player $i$ in $G$.
For $a^i$ in $A^i$, the action $a^i$ is identified with the probability giving weight 1 to action $a^i$. So we identify $a^i$ and the Dirac measure: $\delta_{a^i} = (x^i(b^i))_{b^i \in A^i}$ in $\Delta(A^i)$, where $x^i(a^i) = 1$ and for each $b^i \neq a^i$, $x^i(b^i) = 0$ (this allows us to denote by $2/3 \, T + 1/3 \, B$ the mixed strategy playing $T$ with probability 2/3 and $B$ with probability 1/3). In contrast with other probabilities, $a^i \in A^i$ is called a pure strategy of player $i$. A pure strategy is simply a particular case of mixed strategy. If $a = (a^i)_{i \in N}$ is a pure strategy profile in $A$, we fortunately have $\tilde{g}^i(a) = g^i(a)$ for each $i$.
Notice that each mixed strategy $x^i = (x^i(a^i))_{a^i \in A^i}$ can now be written: $x^i = \sum_{a^i \in A^i} x^i(a^i) \, a^i$. Since $x^i(a^i) \geq 0$ for each $a^i$ and $\sum_{a^i \in A^i} x^i(a^i) = 1$, the mixed strategy $x^i$ is a convex combination of pure strategies of player $i$. The set $\Delta(A^i)$ of mixed strategies of player $i$ is the convex hull of the set $A^i$ of pure strategies of player $i$, i.e. $\Delta(A^i)$ is the smallest subset of $\mathbb{R}^{A^i}$ which is convex and contains all elements of $A^i$.
The following elementary formula, obtained by conditioning the payoff of player $i$ on the random variable of his own action in $A^i$, will be helpful later.
Lemma 1.4.3. Let $i$ be a player in $N$, and let $x = (x^1, ..., x^n)$ in $\Delta(A^1) \times ... \times \Delta(A^n)$ be a mixed strategy profile. We have:
$$\tilde{g}^i(x) = \sum_{a^i \in A^i} x^i(a^i) \, \tilde{g}^i(a^i, x^{-i}).$$
Proof:
$$\tilde{g}^i(x) = \sum_{(a^1, ..., a^n) \in A} x^1(a^1) \cdots x^n(a^n) \, g^i(a^1, ..., a^n)$$
$$= \sum_{a^i \in A^i} x^i(a^i) \sum_{a^{-i} \in A^{-i}} x^1(a^1) \cdots x^{i-1}(a^{i-1}) \, x^{i+1}(a^{i+1}) \cdots x^n(a^n) \, g^i(a^1, ..., a^n)$$
$$= \sum_{a^i \in A^i} x^i(a^i) \, \tilde{g}^i(a^i, x^{-i}).$$
Notice that if $a = (a^i)_{i \in N}$ is a pure strategy profile in $A$, we have $\tilde{g}^i(a) = g^i(a)$ for each $i$.
Lemma 1.4.4. A Nash equilibrium in pure strategies remains a Nash equilibrium in mixed strategies.
Formally, if $a \in A$ is a Nash equilibrium of $G$, then the element $a$, seen as a mixed strategy profile, is a Nash equilibrium of $\tilde{G}$.
Proof: Let $a = (a^1, ..., a^n)$ be a Nash equilibrium of $G$. Fix $i$ in $N$ and $x^i = (x^i(b^i))_{b^i \in A^i}$ in $\Delta(A^i)$. We have:
$$\tilde{g}^i(x^i, a^{-i}) = \sum_{b^i \in A^i} x^i(b^i) \, g^i(b^i, a^{-i}) \leq \sum_{b^i \in A^i} x^i(b^i) \, g^i(a^i, a^{-i}) = 1 \cdot g^i(a) = \tilde{g}^i(a).$$
So $a^i$ is a best reply of player $i$ against $a^{-i}$ in $\tilde{G}$. This being true for each player $i$, $a$ is a Nash equilibrium of $\tilde{G}$.
A Nash equilibrium in mixed strategies is often called a mixed Nash equilibrium. The following result is fundamental.
Theorem 1.4.5. (Nash's theorem) In a finite game, there exists a Nash equilibrium in mixed strategies. Moreover, the set of mixed Nash equilibria, as well as the set of mixed Nash equilibrium payoffs, are compact.
Proof: We will use theorem 1.3.13.
Let $i$ be a player in $N$. $\Delta(A^i)$ is convex, compact and non empty in the Euclidean space $\mathbb{R}^{A^i}$. The mapping $\tilde{g}^i$ is multilinear, hence is continuous. Fix $x^{-i}$ in $\prod_{j \neq i} \Delta(A^j)$, and denote by $h$ the mapping from $\Delta(A^i)$ to $\mathbb{R}$ such that:
$$\forall x^i \in \Delta(A^i), \quad h(x^i) = \tilde{g}^i(x^1, ..., x^i, ..., x^n) = \tilde{g}^i(x^i, x^{-i}).$$
$h$ associates, to each probability $x^i$ on $A^i$, the payoff of player $i$ if he plays $x^i$ against $x^{-i}$. By lemma 1.4.3, we have $h(x^i) = \sum_{a^i \in A^i} x^i(a^i) \, h(a^i)$, so $h$ is affine hence concave and a fortiori quasi-concave.
Glicksberg's theorem consequently applies to $\tilde{G}$ and the results easily follow.
The next proposition will be used in practice to compute mixed Nash equilibria.
Proposition 1.4.6. Characterization of mixed strategy Nash equilibria
Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a finite game, and $x = (x^i)_{i \in N}$ in $\prod_{i \in N} \Delta(A^i)$ be a mixed strategy profile. We have the following equivalence:
$$x \text{ is a mixed Nash equilibrium of } G \iff \Big( \forall i \in N, \ \forall a^i \in A^i \text{ s.t. } x^i(a^i) > 0, \ \tilde{g}^i(a^i, x^{-i}) = \max_{b^i \in A^i} \tilde{g}^i(b^i, x^{-i}) \Big).$$
In words, $x$ is a Nash equilibrium of $\tilde{G}$ if and only if: each pure strategy played with positive probability is a best reply against the strategies of the other players.
Proof: Fix a player $i$. Assume that the other players play according to $x^{-i}$; we study the best replies of player $i$ against $x^{-i}$. As in the proof of Nash's theorem, denote by $h$ the mapping from $\Delta(A^i)$ to $\mathbb{R}$ such that:
$$\forall y^i \in \Delta(A^i), \quad h(y^i) = \tilde{g}^i(y^i, x^{-i}) = \sum_{a^i \in A^i} y^i(a^i) \, h(a^i).$$
$h$ is affine, and $\Delta(A^i)$ is the convex hull of a finite number of points. Let us first show that $h$ attains its maximum at a point in $A^i$. Put $\gamma = \max_{a^i \in A^i} h(a^i)$. $\gamma$ is the best payoff which can be obtained by player $i$ while using pure strategies against $x^{-i}$. We have: $\forall y^i \in \Delta(A^i)$, $h(y^i) = \sum_{a^i \in A^i} y^i(a^i) \, h(a^i) \leq \sum_{a^i \in A^i} y^i(a^i) \, \gamma = \gamma$. Thus against $x^{-i}$, player $i$ cannot obtain a payoff greater than $\gamma$ even if he uses mixed strategies. So $\gamma = \max_{y^i \in \Delta(A^i)} h(y^i)$. Against $x^{-i}$, player $i$ always has a best reply in pure strategies.
Moreover, let $y^i$ be a mixed strategy of player $i$. We have:
$$y^i \text{ is a best reply against } x^{-i} \iff \tilde{g}^i(y^i, x^{-i}) = \gamma \iff \sum_{a^i \in A^i} y^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) = \gamma \iff \sum_{a^i \in A^i} y^i(a^i) \big( \tilde{g}^i(a^i, x^{-i}) - \gamma \big) = 0.$$
In addition we have for each $a^i$ in $A^i$: $y^i(a^i) \geq 0$ and $\tilde{g}^i(a^i, x^{-i}) - \gamma \leq 0$, hence the product $y^i(a^i) \big( \tilde{g}^i(a^i, x^{-i}) - \gamma \big)$ is non positive. Since a sum of non positive terms is 0 if and only if all terms are 0, we get:
$$y^i \text{ is a best reply against } x^{-i} \iff \forall a^i \in A^i, \ y^i(a^i) \big( \tilde{g}^i(a^i, x^{-i}) - \gamma \big) = 0 \iff \forall a^i \in A^i \text{ s.t. } y^i(a^i) > 0, \ \tilde{g}^i(a^i, x^{-i}) = \gamma.$$
We finally obtain that $y^i$ is a best reply against $x^{-i}$ if and only if $y^i$ is a convex combination of the pure strategies being best replies against $x^{-i}$. Since $x$ is a Nash equilibrium if and only if $x^i$ is a best reply against $x^{-i}$ for each $i$, the proposition is proved.
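The characterization of proposition 1.4.6 translates directly into a test that can be run on a candidate profile. The sketch below handles 2-player finite games with exact fractions; the helper names `pure_payoff` and `is_mixed_nash` and the data layout are assumptions for illustration, and the payoffs are the standard Matching Pennies ones.

```python
from fractions import Fraction as F


def pure_payoff(g, x, i, ai):
    """Expected payoff of player i (0 or 1) playing the pure action ai
    against the mixed strategy x[1 - i] of the other player."""
    other = x[1 - i]
    if i == 0:
        return sum(p * g[(ai, aj)][0] for aj, p in other.items())
    return sum(p * g[(aj, ai)][1] for aj, p in other.items())


def is_mixed_nash(g, x):
    """Support test: every pure action played with positive probability
    must reach the maximal payoff against the other player's strategy.
    x[i] must list every pure action of player i, possibly with weight 0."""
    for i in (0, 1):
        values = {ai: pure_payoff(g, x, i, ai) for ai in x[i]}
        best = max(values.values())
        if any(x[i][ai] > 0 and values[ai] != best for ai in x[i]):
            return False
    return True


matching_pennies = {
    ("T", "L"): (1, -1), ("T", "R"): (-1, 1),
    ("B", "L"): (-1, 1), ("B", "R"): (1, -1),
}
half = F(1, 2)
uniform = ({"T": half, "B": half}, {"L": half, "R": half})
lopsided = ({"T": F(1), "B": F(0)}, {"L": half, "R": half})
print(is_mixed_nash(matching_pennies, uniform))
print(is_mixed_nash(matching_pennies, lopsided))
```

In the second profile player 1 is indifferent, but player 2 puts weight on $L$, which is not a best reply against $T$, so the test fails for player 2.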
Remark 1.4.7. On the interpretation of mixed Nash equilibria.
The probabilities do not necessarily represent lotteries used by the players before choosing their pure action. A mixed strategy Nash equilibrium $(x^1, ..., x^n)$ can also represent a situation where:
- each player $i$ has a belief over the pure strategies to be employed by the other players. More precisely, each player $i$ estimates the action to be played by a player $j \neq i$ with the probability $x^j$. Hence here $x^j$ is seen as the belief, or subjective probability, of the other players over the action to be played by player $j$. It is important to notice that all players $i$ different from $j$ share the same belief $x^j$ over the action of player $j$ (this is not always justified in practice).
- each player $i$ plays (randomly or not, in any manner he likes...) a pure strategy $a^i$ in $A^i$ which is optimal given the belief of player $i$, i.e. which is a best reply against $x^{-i}$ (see proposition 1.4.6).
1.4.2 Elimination of strategies strictly dominated by a mixed strategy
We come back to the elimination of strictly dominated strategies. Fix a finite game $G$ as before, and assume that player $i$ has a pure strategy $b^i$ which is strictly dominated in $G$, or more generally in the mixed extension $\tilde{G}$ of $G$ (see example 1.4.9 later). That is, there exists a mixed strategy $z^i$ in $\Delta(A^i)$ which strictly dominates $b^i$, i.e. which satisfies: $\forall x^{-i} \in \prod_{j \neq i} \Delta(A^j)$, $\tilde{g}^i(z^i, x^{-i}) > \tilde{g}^i(b^i, x^{-i})$.
We now show that any mixed strategy $x^i$ in $\Delta(A^i)$ playing $b^i$ with positive probability is also strictly dominated in $\tilde{G}$. Consider $x^i$ in $\Delta(A^i)$ such that $x^i(b^i) > 0$. The idea is to transfer the weight assigned by $x^i$ to $b^i$ onto the strategy $z^i$. Precisely, we define $y^i = (y^i(a^i))_{a^i \in A^i}$ by: $\forall a^i \in A^i \setminus \{b^i\}$, $y^i(a^i) = x^i(a^i) + x^i(b^i) z^i(a^i)$, and $y^i(b^i) = x^i(b^i) z^i(b^i)$. All coordinates of $y^i$ are non negative and they sum to 1, so $y^i \in \Delta(A^i)$. Consider now an element $x^{-i}$ in $\prod_{j \neq i} \Delta(A^j)$. We have:
$$\tilde{g}^i(y^i, x^{-i}) = \Big( \sum_{a^i \in A^i \setminus \{b^i\}} y^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) \Big) + y^i(b^i) \, \tilde{g}^i(b^i, x^{-i})$$
$$= \Big( \sum_{a^i \in A^i \setminus \{b^i\}} x^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) \Big) + \Big( \sum_{a^i \in A^i \setminus \{b^i\}} x^i(b^i) z^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) \Big) + x^i(b^i) z^i(b^i) \, \tilde{g}^i(b^i, x^{-i})$$
$$= \Big( \sum_{a^i \in A^i \setminus \{b^i\}} x^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) \Big) + x^i(b^i) \Big( \sum_{a^i \in A^i} z^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) \Big)$$
$$= \Big( \sum_{a^i \in A^i \setminus \{b^i\}} x^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) \Big) + x^i(b^i) \, \tilde{g}^i(z^i, x^{-i}).$$
Since $x^i(b^i) > 0$ and $\tilde{g}^i(z^i, x^{-i}) > \tilde{g}^i(b^i, x^{-i})$, we have:
$$\tilde{g}^i(y^i, x^{-i}) > \Big( \sum_{a^i \in A^i \setminus \{b^i\}} x^i(a^i) \, \tilde{g}^i(a^i, x^{-i}) \Big) + x^i(b^i) \, \tilde{g}^i(b^i, x^{-i}) = \tilde{g}^i(x^i, x^{-i}).$$
Hence $x^i$ is strictly dominated by $y^i$. By proposition 1.3.14, we can eliminate $x^i$ to compute the Nash equilibria of $\tilde{G}$.
Imagine we want to compute the mixed Nash equilibria of $G$. A slight modification of the proof of proposition 1.3.14 shows that we could start by removing all mixed strategies playing $b^i$ with positive probability, hence we can simply remove the pure action $b^i$ from the game $G$, and then consider the mixed Nash equilibria of the restricted game. We have thus obtained the following result.
Proposition 1.4.8. Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a finite strategic game. Let $i$ be in $N$ and $b^i \in A^i$ be a pure strategy of player $i$ which is strictly dominated in the mixed extension of $G$. Let $G'$ be the game obtained from $G$ by removing strategy $b^i$.
Then the sets of mixed Nash equilibria of $G$ and $G'$ coincide.
If $b^i$ is strictly dominated in $G$, there exists $a^i$ in $A^i$ s.t.: $\forall a^{-i} \in \prod_{j \neq i} A^j$, $g^i(a^i, a^{-i}) > g^i(b^i, a^{-i})$. This implies, taking expectations, that: $\forall x^{-i} \in \prod_{j \neq i} \Delta(A^j)$, $\tilde{g}^i(a^i, x^{-i}) > \tilde{g}^i(b^i, x^{-i})$. So $b^i$ remains strictly dominated in $\tilde{G}$. The converse does not hold in general, as the following example shows.
Example 1.4.9.
$$\begin{array}{c|cc} & l & r \\ \hline T & (3, 0) & (0, 1) \\ M & (0, 0) & (3, 1) \\ B & (1, 1) & (1, 0) \end{array}$$
In $G$, no strategy of player 1 is strictly dominated. But in the mixed extension $\tilde{G}$, the pure strategy $B$ of player 1 is strictly dominated by the mixed strategy $\frac{1}{2} T + \frac{1}{2} M$ (whatever player 2 plays, this strategy gives player 1 an expected payoff of 3/2, which is greater than 1, the payoff induced by $B$).
With the previous argument, to compute the mixed Nash equilibria of $G$ it is possible to first eliminate $B$. We are left with the remaining game:
$$\begin{array}{c|cc} & l & r \\ \hline T & (3, 0) & (0, 1) \\ M & (0, 0) & (3, 1) \end{array}$$
In this new game, $l$ is strictly dominated, hence we remove it and we easily obtain: $(M, r)$ is the unique Nash equilibrium (in pure and in mixed strategies) of the initial game.
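The domination claimed in example 1.4.9 can be checked numerically. In the Python sketch below, the names `mixed_value` and `strictly_dominates` are hypothetical helpers; the test is made against the pure actions of player 2 only, which is enough because the payoff is linear in the mixed strategy of player 2.

```python
from fractions import Fraction as F

# Payoffs of player 1 in example 1.4.9, indexed by (row, column).
g1 = {
    ("T", "l"): 3, ("T", "r"): 0,
    ("M", "l"): 0, ("M", "r"): 3,
    ("B", "l"): 1, ("B", "r"): 1,
}
COLS = ["l", "r"]


def mixed_value(z, col):
    """Expected payoff of player 1 playing the mixed strategy z against col."""
    return sum(p * g1[(a, col)] for a, p in z.items())


def strictly_dominates(z, b):
    """True if z gives strictly more than the pure row b against every column."""
    return all(mixed_value(z, c) > g1[(b, c)] for c in COLS)


half_half = {"T": F(1, 2), "M": F(1, 2)}
print(strictly_dominates(half_half, "B"))    # mixture of T and M against B
print(strictly_dominates({"T": F(1)}, "B"))  # pure T alone against B
print(strictly_dominates({"M": F(1)}, "B"))  # pure M alone against B
```

Neither $T$ nor $M$ alone dominates $B$, while their equal mixture does, which is exactly the gap between domination in $G$ and domination in the mixed extension.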
1.4.3 Generalization of the mixed extension
When the pure action sets are no longer finite (e.g., the interval $[0, 1]$, or the set $\mathbb{N}$ of non negative integers), it is also interesting to introduce the notion of mixed strategies. Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a strategic game, with $N = \{1, ..., n\}$, and assume moreover that:
- each action set $A^i$ is endowed with a $\sigma$-algebra $\mathcal{A}^i$.
- for each player $i$, the mapping $g^i : A \to \mathbb{R}$ is measurable with respect to the product $\sigma$-algebra $\otimes_{i \in N} \mathcal{A}^i$ (i.e. for each real number $t$, the set $\{a \in A, g^i(a) \leq t\}$ is in $\otimes_{i \in N} \mathcal{A}^i$) and bounded.
See for example exercise 6.1.24 (all-pay auctions).
The mixed extension of $G$ can be defined as the game:
$\tilde{G} = (N, (\Delta(A^i))_{i \in N}, (\tilde{g}^i)_{i \in N})$ where:
- for each player $i$, $\Delta(A^i)$ is the set of probabilities over $(A^i, \mathcal{A}^i)$,
- for each player $i$ in $N$, for each $(x^1, ..., x^n)$ in $\Delta(A^1) \times ... \times \Delta(A^n)$, we have:
$$\tilde{g}^i(x^1, ..., x^n) = \int_A g^i(a^1, ..., a^n) \, dx^1(a^1) ... dx^n(a^n).$$
Notice that the above integral exists since $g^i$ is bounded and we integrate with respect to a probability measure:
$$\int_A |g^i(a^1, ..., a^n)| \, dx^1(a^1) ... dx^n(a^n) \leq \Big( \sup_{a \in A} |g^i(a)| \Big) \int_A dx^1(a^1) ... dx^n(a^n) = \sup_{a \in A} |g^i(a)| < +\infty.$$
Nash's theorem 1.4.5 can be extended as follows:
Theorem 1.4.10. (Nash-Glicksberg) Assume that for each player $i$ in $\{1, ..., n\}$, the strategy set $A^i$ is compact metric and the payoff function $g^i : A \to \mathbb{R}$ is continuous. Then a mixed Nash equilibrium exists.
For an example, see the all-pay auctions of exercise 6.1.24.
1.5 Rationalizability
The concept of rationalizable strategies has some similarities with, but is different from, the iterated elimination of strictly dominated strategies. A strategy which is never a best reply cannot be played at a Nash equilibrium. Since each player can infer this, it makes sense to look at the game where such strategies have been removed. We can then iterate and remove the strategies which are never best replies in this game, etc. What is left at the end?
We start with $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$, where $N = \{1, ..., n\}$. For each player $i$, we denote by $P(A^i)$ the set of subsets of $A^i$, and for each $B = B^1 \times B^2 \times ... \times B^n$ in $\prod_{i \in N} P(A^i)$, we define:
$$F^i(B) = \{a^i \in B^i, \ \exists a^{-i} \in B^{-i}, \ a^i \text{ is a best reply against } a^{-i} \text{ in } G\},$$
$$F(B) = F^1(B) \times F^2(B) \times ... \times F^n(B).$$
Since $F^i(B) \subset B^i$ for each $i$, we have $F(B) \subset B$. We can now formally define rationalizable strategies.
Definition 1.5.1. Put $A_0 = A$ and $A_{k+1} = F(A_k)$ for each $k \geq 0$. The set of rationalizable strategies of $G$ is defined as $R = \bigcap_{k \geq 0} A_k$.
Notice that $(A_k)$ is a weakly decreasing sequence, i.e. $A_{k+1} \subset A_k$ for each $k$. And clearly, any Nash equilibrium of $G$ will belong to the set of rationalizable strategies.
Theorem 1.5.2. (Bernheim 1984, Pearce 1984) Assume that for each player $i$ the set $A^i$ is metric compact and the payoff function $g^i$ is continuous (the game $G$ is said to be compact). Then the set $R$ of rationalizable strategies is non empty and compact, and we have $R = F(R)$. Moreover, $R$ is the largest set $L \in \prod_{i \in N} P(A^i)$ s.t. $L = F(L)$.
Proof:
1) It is first easy to check that $F(B)$ is compact whenever $B$ is, hence for each $k$ the set $A_k$ is compact.
2) We have $A_{k+1} \subset A_k$ for each $k$. Assume for the sake of contradiction that $A_{k+1} = \emptyset$ for a first integer $k$. We have $A_k \neq \emptyset$, so there exists $a = (a^1, ..., a^n) \in A_k$, and $a \in A_0, A_1, ..., A_k$. Let $a' = (a'^1, ..., a'^n)$ be a component wise best reply against $a$ in the game $G$. We have $a' \in A_0, A_1, ..., A_k, A_{k+1}$, hence $A_{k+1} \neq \emptyset$. This is a contradiction, hence for each $k$ we have $A_k \neq \emptyset$. We obtain that $R$ is compact and non empty as a decreasing limit of non empty compact sets.
3) We prove that $R \subset F(R)$. Let $a = (a^1, ..., a^n) \in R = \bigcap_{k \geq 0} A_k$. For each $k$ there exist $b_k(1) \in A^{-1}_k$, ..., $b_k(n) \in A^{-n}_k$ such that for each player $i$, $a^i$ is a best reply against $b_k(i)$ in the game $G$. Taking converging subsequences, we have a limit point $b(i)$ for each $(b_k(i))_k$, and $a^i$ is a best reply against $b(i)$ in the game $G$. We have $(a^i, b_k(i)) \in A_k$ for each $k$, hence $(a^i, b(i)) \in R$. We obtain that $a^i \in F^i(R)$, hence $a \in F(R)$ and we have proved $R \subset F(R)$. $F(R) \subset R$ being clear, we get $R = F(R)$.
4) Let $L$ in $\prod_{i \in N} P(A^i)$ be s.t. $L = F(L)$. We have $L \subset A_0$, hence $L = F(L) \subset F(A_0) = A_1$. Iterating, we get $L \subset A_k$ for each $k$, hence $L \subset R$.
We now conclude chapter 1 by looking at a particular class of strategic
games.
1.6 Zero-sum games
Zero-sum games represent the strategic interactions with 2 players having opposite interests. The conflict is total (games of war), no cooperation is possible.
1.6.1 Definition, value and optimal strategies
In the next definition, the word game refers to a strategic game.
Definition 1.6.1. A zero-sum game is a 2-player game such that the sum of the payoff functions is zero.
Matching Pennies (example 1.1.5) is a zero-sum game. To simplify notations, a zero-sum game is represented by a triplet $(I, J, g)$ where:
- $I$ is the action set of player 1,
- $J$ is the action set of player 2,
- and $g : I \times J \to \mathbb{R}$ is the payoff function of player 1.
We recall the interpretation of such a game. Simultaneously, player 1 chooses $i$ in $I$ and player 2 chooses $j$ in $J$. Then the payoff to player 1 is $g(i, j)$, and the payoff to player 2 is $-g(i, j)$ (or equivalently, player 2 pays the amount $g(i, j)$ to player 1). All this description is known by the players.
We represent a finite zero-sum game with a matrix where player 1 chooses a row, player 2 chooses a column and the entries of the matrix are the payoffs for player 1. For example, Matching Pennies can be represented by the matrix:
$$\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}.$$
Conversely, any real matrix can be seen as a zero-sum game.
Fix in the sequel a zero-sum game $G = (I, J, g)$. Let us forget for a moment what we have learned so far, and think about the meaning of playing well in a zero-sum game $G$. On the one hand, player 1 wants to maximize the payoff function $g$, but this function depends on 2 variables $i$ and $j$, and player 1 only controls the variable $i$ and not the variable $j$. On the other hand, player 2 wants to minimize $g$, and controls $j$ but not $i$.
In the sequel we will use the standard notation $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$ (let us mention that it would be possible to define the mapping $g$ with values in $\overline{\mathbb{R}}$; here we simply assume that $g$ takes its values in $\mathbb{R}$).
Definition 1.6.2. Let $x$ be in $\overline{\mathbb{R}}$. We say that player 1 guarantees (or can force) $x$ in $G$ if he can play in a way such that his payoff will be at least $x$, i.e. if there exists $i$ in $I$ such that: $\forall j \in J$, $g(i, j) \geq x$.
Player 1 always guarantees $-\infty$, and never guarantees $+\infty$. Consider now $i$ in $I$. If player 1 plays action $i$, he is sure to obtain, whatever player 2 plays, a payoff not lower than $\inf_{j \in J} g(i, j) \in \mathbb{R} \cup \{-\infty\}$ (recall that $\inf_{j \in J} g(i, j) \in \mathbb{R}$ if the set $\{g(i, j), j \in J\}$ is bounded from below, and we have $\inf_{j \in J} g(i, j) = -\infty$ otherwise). Hence player 1 guarantees $\inf_{j \in J} g(i, j)$.
Definition 1.6.3. We put:
$$\alpha = \sup_{i \in I} \inf_{j \in J} g(i, j) \in \overline{\mathbb{R}}.$$
In Matching Pennies, for any action of player 1 player 2 has a reply yielding a payoff of $-1$ for player 1, hence in this game $\alpha = -1$. Consider now the game:
$$\begin{pmatrix} 1 & -1 \\ -2 & 0 \end{pmatrix}.$$
If player 1 plays Top, his payoff will be $-1$ at worst, and if he plays Bottom his payoff will be $-2$ at worst. Hence here also we have $\alpha = -1$.
In general, if $\alpha < +\infty$ then for each $\varepsilon > 0$, player 1 guarantees $\alpha - \varepsilon$. (If $\alpha = +\infty$, then for each real $x$ player 1 can play in order to get a payoff of at least $x$, and life is easy for player 1; on the contrary if $\alpha = -\infty$, then we have $\inf_{j \in J} g(i, j) = -\infty$ for each $i$ in $I$, and player 1 guarantees nothing.)
We now consider player 2.
Definition 1.6.4. Let $x$ be in $\overline{\mathbb{R}}$. We say that player 2 guarantees (or can force) $x$ in $G$ if he can play in a way such that player 1's payoff will be at most $x$, i.e. if there exists $j$ in $J$ such that: $\forall i \in I$, $g(i, j) \leq x$.
And we define
$$\beta = \inf_{j \in J} \sup_{i \in I} g(i, j) \in \overline{\mathbb{R}}.$$
Player 2 can always guarantee $+\infty$, and never guarantees $-\infty$. By playing an action $j$ in $J$, player 2 is sure that the payoff to player 1 will be at most $\sup_{i \in I} g(i, j)$ (and his own payoff will be at least $-\sup_{i \in I} g(i, j)$), hence player 2 guarantees $\sup_{i \in I} g(i, j)$. If $\beta > -\infty$, then for each $\varepsilon > 0$ player 2 guarantees $\beta + \varepsilon$. In Matching Pennies, we have $\beta = 1$. In the game
$$\begin{pmatrix} 1 & -1 \\ -2 & 0 \end{pmatrix},$$
we have $\beta = 0$.
Lemma 1.6.5.
$$\alpha \leq \beta.$$
Proof: For any $i$ in $I$ and $j$ in $J$ we have: $g(i, j) \geq \inf_{j' \in J} g(i, j')$. Taking the supremum over $i$ on each side, we obtain that for each $j$ in $J$, we have: $\sup_{i \in I} g(i, j) \geq \alpha$. Taking now the infimum in $j$, we get: $\beta \geq \alpha$.
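For a matrix game, the quantity $\sup_i \inf_j g(i, j)$ is the largest row minimum and $\inf_j \sup_i g(i, j)$ is the smallest column maximum. The short Python sketch below computes both on the two matrices discussed above; the function names `maxmin` and `minmax` are illustrative choices, not notation from these notes.

```python
def maxmin(matrix):
    """Largest row minimum: what player 1 guarantees with a pure action."""
    return max(min(row) for row in matrix)


def minmax(matrix):
    """Smallest column maximum: what player 2 guarantees with a pure action."""
    return min(max(col) for col in zip(*matrix))


matching_pennies = [[1, -1],
                    [-1, 1]]
second_game = [[1, -1],
               [-2, 0]]

for m in (matching_pennies, second_game):
    print(maxmin(m), minmax(m))
```

In both matrices the first number is strictly below the second, so the inequality of lemma 1.6.5 is strict and neither game has a value in pure strategies.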
Remark 1.6.6. In this remark we consider modifications of the timing of the game.
Consider a first variant of the game where: player 1 first chooses $i$ in $I$, announces $i$ to player 2, then player 2 chooses $j$ in $J$ and pays $g(i, j)$ to player 1. In this new game, it is natural that the rational issue of the interaction corresponds to $\alpha$. Consider now a second variant of the game where on the contrary, player 2 plays first: he chooses and announces $j$ in $J$ to player 1, then player 1 chooses $i$ in $I$ and has payoff $g(i, j)$. Then the rational issue of this second variant naturally is $\beta$.
Lemma 1.6.5 states that player 1 will prefer the second variant, and that player 2 will prefer the first variant. In a zero-sum game, a player dislikes moving first, which is quite intuitive. In the sequel we come back to the interpretation where players play simultaneously.
The following definition introduces the main solution concept for zero-sum games.
Definition 1.6.7. We say that a zero-sum game $G = (I, J, g)$ has a value if $\alpha = \beta$, i.e. if:
$$\sup_{i \in I} \inf_{j \in J} g(i, j) = \inf_{j \in J} \sup_{i \in I} g(i, j).$$
In this case, the quantity $\alpha = \beta$ is called the value of the game $G$, and is often denoted by $v = \alpha = \beta$.
Assume that $G$ has a value $v$. Then $v$ represents the rational issue of the game, in the sense of the fair amount that player 1 should pay to player 2 to get the right to play the game. And we have, if $v$ belongs to the reals:
$$\forall \varepsilon > 0, \ \exists i \in I, \ \forall j \in J, \ g(i, j) \geq v - \varepsilon,$$
$$\forall \varepsilon > 0, \ \exists j \in J, \ \forall i \in I, \ g(i, j) \leq v + \varepsilon.$$
Definition 1.6.8. Assume that $G$ has a value $v$.
A strategy $i$ in $I$ satisfying: $\forall j \in J$, $g(i, j) \geq v$ is called an optimal strategy of player 1 in $G$.
A strategy $j$ in $J$ satisfying: $\forall i \in I$, $g(i, j) \leq v$ is called an optimal strategy of player 2 in $G$.
Example 1.6.9. Consider the zero-sum game $G = (\mathbb{N}, \mathbb{N}, g)$, where for each couple $(i, j)$ in $\mathbb{N} \times \mathbb{N}$, $g(i, j) = 1/(j + 1)$. The payoff here does not depend on the action played by player 1. We have $\alpha = 0 = \beta$, hence the game has a value which is 0. All strategies of player 1 are optimal here, and player 2 has no optimal strategy.
Notice that it is not excluded that the game has a value $v$ which is $+\infty$ or $-\infty$. $v = +\infty$ means that for each real number $M$, player 1 guarantees $M$, i.e.: $\forall M \in \mathbb{R}$, $\exists i \in I$, $\forall j \in J$, $g(i, j) \geq M$. Similarly, $v = -\infty$ means: $\forall M \in \mathbb{R}$, $\exists j \in J$, $\forall i \in I$, $g(i, j) \leq M$. In these cases where $|v| = +\infty$, no player has an optimal strategy (since we assumed that $g$ has real values). When $G$ is finite, or is the mixed extension of a finite game, then the mapping $g$ is bounded and it is not possible to have $|v| = +\infty$.
Notation 1.6.10. We denote by $O_1$ (resp. $O_2$) the set of optimal strategies of player 1 (resp. player 2) in $G$. If $G$ has no value, then no player has an optimal strategy, and $O_1 = O_2 = \emptyset$.
We have $O_1 \subset I$ and $O_2 \subset J$. The idea is that if $G$ has a value and both players have optimal strategies in $G$, playing well for a player means playing an optimal strategy.
1.6.2 Links with Nash equilibria and the MinMax theorem
A zero-sum game $G = (I, J, g)$ is also a particular case of a strategic game. Consequently we can also consider the notions previously defined for general strategic games (often called, in contrast, general sum games or non zero-sum games, with the strange consequence that a zero-sum game is a particular case of a non zero-sum game). Denote by $EN$ the subset of $I \times J$ whose elements are the Nash equilibria of $G$. The illuminating link between Nash equilibria and optimal strategies in a zero-sum game is the following.
Proposition 1.6.11.
$$O_1 \times O_2 = EN.$$
Moreover, the 3 following assertions are equivalent.
1) There exists a Nash equilibrium in $G$.
2) $G$ has a value, and each player has an optimal strategy.
3) There exists a real $v$ such that: $(\exists i \in I, \forall j \in J, g(i, j) \geq v)$ and $(\exists j \in J, \forall i \in I, g(i, j) \leq v)$.
If these conditions are satisfied, $v$ is the value of $G$ and the set of Nash equilibrium payoffs of $G$ is the singleton $\{(v, -v)\}$.
Proof:
Let $(i, j)$ be in $O_1 \times O_2$. Then $O_1 \times O_2$ is not empty, so the game has a value $v$, and we have: $\forall i' \in I$, $g(i', j) \leq v \leq g(i, j')$ $\forall j' \in J$. Thus $g(i, j) \leq v \leq g(i, j)$, and $g(i, j) = v$. We get: $\forall i' \in I$, $g(i', j) \leq g(i, j) \leq g(i, j')$ $\forall j' \in J$, and this means that $(i, j)$ is a Nash equilibrium of $G$.
Conversely, let $(i, j)$ be a Nash equilibrium of $G$. We have: $\forall i' \in I$, $\forall j' \in J$, $g(i', j) \leq g(i, j) \leq g(i, j')$. So $\sup_{i' \in I} g(i', j) = g(i, j) = \inf_{j' \in J} g(i, j')$. Since $\sup_{i' \in I} g(i', j) \geq \beta$ and $\inf_{j' \in J} g(i, j') \leq \alpha$, we obtain $\beta \leq g(i, j) \leq \alpha$. Since by lemma 1.6.5 one always has $\alpha \leq \beta$, we obtain $\alpha = \beta = g(i, j)$, $G$ has a value which is $g(i, j)$, and $(i, j)$ is in $O_1 \times O_2$.
This proves $O_1 \times O_2 = EN$. Condition 1) means $EN \neq \emptyset$, and condition 2) means $O_1 \times O_2 \neq \emptyset$, hence 1) and 2) are equivalent. That 2) implies 3) is clear. If 3) is satisfied, we easily show that the couple $(i, j)$ given by the quantifiers is a Nash equilibrium of $G$, hence 3) implies 1). So 1), 2) and 3) are equivalent.
If the conditions are satisfied, there exists a Nash equilibrium payoff, and for each Nash equilibrium $(i, j)$ we have $g(i, j) = v$, hence the payoff of a Nash equilibrium is $v$ for player 1 and $-v$ for player 2, that is the couple $(v, -v)$.
Recall that every finite game admits a Nash equilibrium in mixed strategies. This implies the following result, and as a consequence one can associate to each real matrix a real number called the value of the matrix.
Theorem 1.6.12. Minmax theorem of von Neumann
Let $G = (I, J, g)$ be a finite zero-sum game. Then the mixed extension of $G$ has a value, and both players have optimal strategies. We have:
$$\min_{y \in \Delta(J)} \max_{x \in \Delta(I)} \tilde{g}(x, y) = \max_{x \in \Delta(I)} \min_{y \in \Delta(J)} \tilde{g}(x, y).$$
Proof: By Nash's theorem 1.4.5, the mixed extension $\tilde{G}$ of $G$ has a Nash equilibrium. $\tilde{G}$ being a zero-sum game, the previous proposition states that it has a value and both players have optimal strategies. In particular we have:
$$\inf_{y \in \Delta(J)} \sup_{x \in \Delta(I)} \tilde{g}(x, y) = \sup_{x \in \Delta(I)} \inf_{y \in \Delta(J)} \tilde{g}(x, y).$$
Since $\tilde{g}$ is continuous and $\Delta(I)$ and $\Delta(J)$ are compact, the suprema are maxima and the infima are minima.
Historically, the Minmax theorem was proved before Nash's theorem, and it is not difficult to write a direct proof of it using a separation theorem rather than a fixed point theorem such as Brouwer's or Kakutani's theorem.
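For a 2x2 matrix game without a pure saddle point, the value and the optimal mixed strategies are given by a classical closed-form expression, obtained by making the opponent indifferent between his two actions. The Python sketch below applies it; the function name `solve_2x2` is an assumption made for illustration, and the formula is a standard textbook fact rather than a result stated in these notes.

```python
from fractions import Fraction as F


def solve_2x2(matrix):
    """Value and optimal mixed strategies of a 2x2 zero-sum matrix game.

    Returns (v, p, q): v is the value, p the probability of the first row
    for player 1, q the probability of the first column for player 2.
    Assumes there is no pure saddle point (otherwise raises ValueError).
    """
    (a, b), (c, d) = matrix
    pure_maxmin = max(min(a, b), min(c, d))
    pure_minmax = min(max(a, c), max(b, d))
    if pure_maxmin == pure_minmax:
        raise ValueError("pure saddle point: the value is reached without mixing")
    denom = F(a + d - b - c)
    p = F(d - c) / denom          # makes player 2 indifferent between columns
    q = F(d - b) / denom          # makes player 1 indifferent between rows
    v = F(a * d - b * c) / denom
    return v, p, q


matching_pennies = [[1, -1], [-1, 1]]
second_game = [[1, -1], [-2, 0]]
print(solve_2x2(matching_pennies))
print(solve_2x2(second_game))

# Check on the second game: the row mixture (p, 1 - p) yields exactly v
# against each pure column, so player 1 guarantees v.
v, p, q = solve_2x2(second_game)
against_columns = [p * second_game[0][j] + (1 - p) * second_game[1][j] for j in (0, 1)]
```

For the second matrix the value lies strictly between the pure maxmin $-1$ and the pure minmax $0$, which illustrates how mixing closes the gap left by pure strategies.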
Another interpretation for mixed optimal strategies. Let $G = (I, J, g)$ be a matrix game, with $I = \{1, ..., n\}$ and $J = \{1, ..., p\}$. Assume that each row $i$ corresponds to an asset of price 1 today. The state of the world tomorrow is unknown, each element of $J$ representing one state. If the state tomorrow is $j$, the asset $i$ will be worth $g(i, j)$ tomorrow. A decision-maker wants to invest $M > 0$ euros and to do so he will construct a portfolio $x = (x(i))_{i \in I}$ in $\Delta(I)$, where for each $i$, $x(i)$ represents the proportion of asset $i$ in the portfolio. If the state tomorrow is $j$, the value of the portfolio $x$ tomorrow will thus be $\sum_{i} x(i) M g(i, j)$. To say that $x$ is an optimal strategy of player 1 in the mixed extension of $G$ is equivalent to saying that the decision maker choosing $x$ is maximizing the value of his portfolio tomorrow in the worst case. An example is given in exercise 6.1.22.
To conclude this section we give an example of a zero-sum game without a value (and allowing for mixed strategies would not help).
Example 1.6.13. Let $G = (I, J, g)$ be the zero-sum game where: $I = J = \mathbb{N}$, and for each couple of non negative integers $(i, j)$:
$$g(i, j) = \begin{cases} 1 & \text{if } i \geq j \\ -1 & \text{if } i < j \end{cases}$$
We have $\alpha = -1$ and $\beta = 1$, hence the game has no value. In this game, each player has to choose a number and the player picking the highest number wins. One can consider that playing well means nothing here.
Chapter 2
Extensive games
2.1 Introduction
In extensive-form games, we pay attention to the explicit time structure of
the interaction. Roughly speaking, an extensive (or extensive-form) game is
a game represented by a picture with a tree. Three important aspects may
be present:
- dynamic aspect (several stages),
- simultaneity of some decisions (as for strategic games),
- players may have incomplete information (we add player 0, called nature).
2.2 Model
We only present the model for finite extensive-form games.
Definition 2.2.1. An extensive-form game is defined by the following elements, from 1) to 6):
1) Players: N = {1, ..., n} + player 0 (nature), where n is a positive integer.
2) A tree: $\mathcal{T} = (X, r, \varphi)$, where:
- X is a finite set called the set of nodes. A node represents a possible position of the game.
- $r \in X$ is a particular node called the root. The game starts there.
- $\varphi$ is a mapping from $X \setminus \{r\}$ to X called the predecessor mapping. For each node x in $X \setminus \{r\}$, $\varphi(x)$ is called the predecessor of x. We impose that each node eventually comes from the root, i.e. we assume that for each node x, there exists $m \geq 0$ such that $\varphi^m(x) = r$ (where $\varphi^m(x) = \varphi(\varphi(\dots\varphi(x)))$, with $\varphi$ iterated m times).
For a node x, we write $\varphi^{-1}(x) = \{y \in X, \varphi(y) = x\}$. We call $\varphi^{-1}(x)$ the set of successors of the node x.
We put $T = \{x \in X, \varphi^{-1}(x) = \emptyset\}$. We call T the set of terminal nodes of the game. When the play reaches a node in T, the game ends.
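As a rough illustration of the tree described above, the predecessor mapping can be stored as a dictionary, from which successors and terminal nodes are derived. The node names and the dictionary encoding are hypothetical, chosen only for this sketch.

```python
# Hypothetical toy tree: keys are the nodes other than the root,
# values are their predecessors.
predecessor = {"a": "r", "b": "r", "c": "a", "d": "a"}
root = "r"
nodes = {root} | set(predecessor)

def successors(x):
    """Set of nodes whose predecessor is x."""
    return {y for y, p in predecessor.items() if p == x}

def comes_from_root(x):
    """Check that iterating the predecessor mapping from x reaches the root."""
    steps = 0
    while x != root:
        x = predecessor[x]
        steps += 1
        if steps > len(nodes):  # guard against cycles in malformed data
            return False
    return True

# Terminal nodes are the nodes without successors.
terminal = {x for x in nodes if not successors(x)}
```

Here the terminal nodes are b, c and d, and every node passes the `comes_from_root` check.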
3) Information structure:
3a) Who plays when?
We have a partition of non-terminal nodes: $(X^0, ..., X^n)$ is a partition of $X \setminus T$ satisfying: $\forall i \in N$, $X^i \neq \emptyset$. $X^i$ is the set of nodes where player i plays. For every node x in $X \setminus T$, we denote by i(x) the player in $N \cup \{0\}$ who plays when the play is at x.
3b) Who knows what?
For every player i in $N \cup \{0\}$, we have a partition $U^i$ of $X^i$. An element u in $U^i$ is called an information set of player i, and has to be non-empty.
Interpretation: When the play is at an information set $u \in U^i$, player i has to play. He knows that the play is at some node in u, but does not know at which node in u. Player i cannot distinguish among the different nodes in u.
Notation: $U = \bigcup_{i \in N \cup \{0\}} U^i$. The set U is the set of all information sets of the game.
4) Decision structure:
For every player i in $N \cup \{0\}$, for every information set $u \in U^i$ of player i, we have a set of actions $A^u$ and a mapping $\psi^u$ from $u \times A^u$ to $\bigcup_{x \in u} \varphi^{-1}(x)$ such that: for each node x in u, $\psi^u(x, \cdot)$ is a bijection from $A^u$ to $\varphi^{-1}(x)$.
$A^u$ is the set of available actions of player i at u. When the play is at u, player i has to pick an element in $A^u$. The mapping $\psi^u$ is called the transition mapping at u. The element $\psi^u(x, a)$ is the node reached after x if player i has played a.
Interpretation: imagine that the game is in x, with $x \in u$ and $u \in U^i$. It is player i's turn to play, he knows that the play is in u, but does not know at which node of u. Player i chooses an action a in $A^u$, and the play goes to $\psi^u(x, a)$.
5) Particular case of Player 0 (nature):
For each x in $X^0$, it is assumed that {x} is an information set of player 0: nature is always informed of everything. Formally, $U^0 = \{\{x\}, x \in X^0\}$. In a game where nature plays no role, we have $X^0 = \emptyset$ and then $U^0 = \emptyset$.
Nature has a fixed strategy. If the game is at x in $X^0$, nature chooses a in $A^{\{x\}}$ according to the exogenous probability $\sigma^0_{\{x\}}$. We assume each action a in $A^{\{x\}}$ has a positive probability to be selected under $\sigma^0_{\{x\}}$. We denote by $\sigma^0 = (\sigma^0_{\{x\}})_{x \in X^0}$ the fixed strategy of nature. Nature has no payoff, see below.
6) Payoffs: for each player i in N, we have a payoff function $\gamma^i : T \to \mathbb{R}$. The mapping $\gamma^i$ associates to each terminal node the payoff of player i. Nature has no payoff.
$\Gamma = \left(N, (X, r, \varphi), (X^i, U^i)_{i \in N \cup \{0\}}, (A^u, \psi^u)_{u \in U}, (\sigma^0_{\{x\}})_{x \in X^0}, (\gamma^i)_{i \in N}\right)$ is an extensive-form game. This ends definition 2.2.1.
Progress of the game:
The play starts at the root $r \in X \setminus T$.
Assume the game is at a certain node x in $X \setminus T$. There is a player i in $N \cup \{0\}$, and a unique information set u of player i containing x. Two cases are possible:
1) If $i \in N$, it is player i's turn to play. He knows that the play is at some node in u, but does not know which node in u has been reached. He chooses an action a in $A^u$, and the play goes to $\psi^u(x, a)$, which is an element of $\varphi^{-1}(x)$.
2) If i = 0, nature plays at x. We necessarily have u = {x}. Nature chooses a in $A^u$ according to the probability $\sigma^0_{\{x\}}$, and the play goes to $\psi^u(x, a) \in \varphi^{-1}(x)$.
In both cases, if the new node $\psi^u(x, a)$ belongs to $X \setminus T$, the play continues. Otherwise, $\psi^u(x, a)$ is a terminal node: the play ends and each player i in N receives the payoff $\gamma^i(\psi^u(x, a))$.
In all the following, we fix an extensive-form game $\Gamma$, and we keep the previous notations.
Definition 2.2.2. A play of $\Gamma$ is a finite sequence $\omega = (x_0, ..., x_l)$ with $l \geq 1$, $x_0 = r$, $x_l \in T$, and $x_k = \varphi(x_{k+1})$ for each k in $\{0, ..., l-1\}$. We denote by $\Omega$ the set of all possible plays of $\Gamma$.
Notice that there is a bijection between plays and terminal nodes. Hence, we can define payoffs for plays, and we will put: $\gamma^i(\omega) = \gamma^i(x_l)$ if $\omega = (x_0, ..., x_l)$.
2.3 Associated strategic form
Definition 2.3.1. A pure strategy of a player i in N specifies, for each information set u in $U^i$, the action to be chosen by player i at u. Consequently, it is an element $s^i = (s^i_u)_{u \in U^i}$, with $s^i_u \in A^u$ for each u in $U^i$. The set of pure strategies of player i is denoted by:
$$S^i = \prod_{u \in U^i} A^u.$$
Fix a pure strategy profile $s = (s^i)_{i \in N}$. The profile s naturally induces a probability distribution $P_s$ on the set of plays $\Omega$. Formally, $P_s$ is defined as follows.
Let $\omega = (x_0, ..., x_l)$ be a play, and for each k in $\{0, ..., l-1\}$ denote by $i_k$ the player who plays at $x_k$, by $u_k \in U^{i_k}$ the information set of player $i_k$ which contains $x_k$, and by $a_k \in A^{u_k}$ the action such that $\psi^{u_k}(x_k, a_k) = x_{k+1}$.
We first define the probability that player $i_k$ plays $a_k$ at $u_k$ by: if $i_k \in N$, we put $\sigma^{i_k}_{u_k}(a_k) = 1$ if $s^{i_k}_{u_k} = a_k$ (i.e. if player $i_k$ plays $a_k$ at $u_k$), and $\sigma^{i_k}_{u_k}(a_k) = 0$ otherwise. If $i_k = 0$, the probability that nature plays $a_k$ at $u_k$ is simply denoted by $\sigma^{i_k}_{u_k}(a_k) \in [0, 1]$. The probability of $\omega$ if s is played is then defined by:
$$P_s(\omega) = \sigma^{i_0}_{u_0}(a_0)\, \sigma^{i_1}_{u_1}(a_1) \cdots \sigma^{i_{l-1}}_{u_{l-1}}(a_{l-1}).$$
The expected payoff for player i when s is played is then denoted by:
$$g^i(s) = \mathbb{E}_{P_s}(\gamma^i) = \sum_{\omega \in \Omega} P_s(\omega)\, \gamma^i(\omega).$$
Remarks: 1) $P_s(\omega) \in [0, 1]$, and $\sum_{\omega \in \Omega} P_s(\omega) = 1$. 2) If nature plays no role, then s induces a unique play with probability 1 (examples: chess, matching pennies).
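The product formula for the probability of a play and the resulting expected payoff can be checked on a very small game. The sketch below assumes a hypothetical game (not from these notes): nature tosses a biased coin at the root, and a single player, who does not observe the toss, guesses its outcome. All names and numbers are illustrative.

```python
# Hypothetical toy game: nature moves at the root "r", then player 1 guesses.
nature_prob = {"r": {"h": 0.75, "t": 0.25}}      # nature's fixed strategy
info_set = {"x1": "u", "x2": "u"}                 # player 1 cannot tell x1 from x2
transition = {                                    # (node, action) -> next node
    ("r", "h"): "x1", ("r", "t"): "x2",
    ("x1", "H"): "t1", ("x1", "T"): "t2",
    ("x2", "H"): "t3", ("x2", "T"): "t4",
}
payoff = {"t1": 1.0, "t2": -1.0, "t3": -1.0, "t4": 1.0}  # player 1's payoff at terminal nodes

def plays(node="r", prefix=()):
    """Enumerate plays as (sequence of (node, action) pairs, terminal node)."""
    if node in payoff:
        yield prefix, node
        return
    for (x, a), y in transition.items():
        if x == node:
            yield from plays(y, prefix + ((x, a),))

def play_probability(steps, s):
    """Product of the one-step probabilities along the play, for pure strategy s."""
    p = 1.0
    for x, a in steps:
        if x in nature_prob:
            p *= nature_prob[x][a]                      # nature's probability
        else:
            p *= 1.0 if s[info_set[x]] == a else 0.0    # pure strategy: 0 or 1
    return p

def expected_payoff(s):
    """Sum over plays of probability times payoff."""
    return sum(play_probability(steps, s) * payoff[end] for steps, end in plays())
```

With the pure strategy `{"u": "H"}` only the two plays in which the player guesses H receive positive probability, and the probabilities of all plays sum to 1.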
Definition 2.3.2. The strategic game associated to $\Gamma$ is the game:
$$G = (N, (S^i)_{i \in N}, (g^i)_{i \in N}).$$
By definition, a Nash equilibrium of $\Gamma$ is a Nash equilibrium of G.
Since we assumed that the set of nodes is finite, for each player i in N the set $S^i$ is finite, and so G is a finite strategic-form game. We can define the mixed extension of G as usual.
Notation 2.3.3. The set of mixed strategies of player i is:
$$\Delta(S^i) = \Delta\Big(\prod_{u \in U^i} A^u\Big).$$
By Nash's theorem, we know that every finite strategic-form game has a Nash equilibrium in mixed strategies, hence our extensive-form game $\Gamma$ has a Nash equilibrium in mixed strategies.
2.4 Games with perfect information
Definition 2.4.1. Let $\Gamma$ be an extensive-form game. $\Gamma$ has perfect information if:
- nature plays no role: $X^0 = \emptyset$,
- each information set is a singleton: $\forall i \in N$, $\forall x \in X^i$, $\{x\} \in U^i$.
In a perfect information game, whenever a player has to play, he perfectly knows everything which happened before, and choices are never simultaneous. Examples: chess, exercises 6.2.1, 6.2.2.
We use the same notations as before; a perfect information game is given by:
- a set of players N = {1, ..., n},
- a tree $\mathcal{T} = (X, r, \varphi)$,
- a partition $(X^1, ..., X^n)$ of $X \setminus T$ such that $X^i \neq \emptyset$ for each i in N,
- information sets, which are now determined by $U^i = \{\{x\}, x \in X^i\}$ for each player i,
- for each node x in $X \setminus T$, a set $A^{\{x\}}$ of available actions at x, and a bijective transition function from $A^{\{x\}}$ to the set $\varphi^{-1}(x)$ of successors of x,
- and lastly, for each player i in N, a payoff function $\gamma^i : T \to \mathbb{R}$.
Fix a game $\Gamma$ with perfect information. It is easy to define the notion of subgame starting at a non-terminal node. More precisely, let z be in $X \setminus T$; the subgame starting at z is the perfect information game $\Gamma(z)$ defined by:
- $X(z) = \{x \in X, \exists m \geq 0, \varphi^m(x) = z\}$ (z and nodes after z),
- $N(z) = \{i \in N, X^i \cap X(z) \neq \emptyset\}$ (players playing at and after z),
- r(z) = z (the game $\Gamma(z)$ starts at z),
- natural restrictions of other elements of $\Gamma$: the new predecessor function $\varphi(z)$ satisfies $\varphi(z)(x) = \varphi(x)$ for each x in $X(z) \setminus \{z\}$, the new set of nodes of player i is $X^i(z) = X^i \cap X(z)$, at each node x in X(z) we have the same available actions and transitions as in $\Gamma$, and the payoffs are defined as before (this is possible since each terminal node of $\Gamma(z)$ also is a terminal node of $\Gamma$).
It is also easy and natural to define restrictions of strategies to subgames. More precisely, let $s^i = (s^i_{\{x\}})_{x \in X^i}$ be a pure strategy of some player i in the game $\Gamma$. If $i \in N(z)$, we define $s^i(z)$, the restriction of $s^i$ to the subgame $\Gamma(z)$, as follows: $s^i(z)$ plays the same actions as $s^i$, but is only defined for nodes of $\Gamma(z)$, so that $s^i(z) = (s^i_{\{x\}})_{x \in X^i(z)}$. If now $s = (s^i)_{i \in N}$ is a pure strategy profile of $\Gamma$, we define the restriction s(z) of s to the subgame $\Gamma(z)$ by $s(z) = (s^i(z))_{i \in N(z)}$; s(z) is then a strategy profile in $\Gamma(z)$.
Definition 2.4.2. Let s be a pure strategy profile in $\Gamma$. s is a subgame-perfect equilibrium (SPE) of $\Gamma$ if for each node z in $X \setminus T$, the strategy profile s(z) is a Nash equilibrium of the subgame $\Gamma(z)$.
Lemma 2.4.3. A subgame-perfect equilibrium of $\Gamma$ is a Nash equilibrium of $\Gamma$.
The converse is false (see exercise 6.2.1).
Theorem 2.4.1. (Kuhn-Zermelo 1913) Any finite extensive-form game with perfect information has a subgame-perfect equilibrium in pure strategies.
Proof: by induction on the number of nodes.
- If |X| = 2, there are 2 nodes, and a single player with a unique action going from the root to the unique terminal node. The result is clear.
- Consider $l \geq 2$, and assume the theorem is proved for any game with at most l nodes. Let $\Gamma$ be a perfect information game with l + 1 nodes. Write $x_0 = r$ (root of the game), and denote by $i_0$ the player who plays at $x_0$. For every $x_1 \in \varphi^{-1}(x_0)$, let $s(x_1)$ be a SPE of $\Gamma(x_1)$, and denote by $g(x_1)$ the corresponding payoff for player $i_0$ (if $x_1$ is a terminal node, nothing remains to be chosen after $x_1$ and $g(x_1)$ is simply the payoff of player $i_0$ at $x_1$).
Define a profile of pure strategies $s = (s^i)_{i \in N}$ by:
- for every player $i \neq i_0$, and for every x in $X^i$, $s^i$ plays at node x the action played by $s^i(x_1)$ at node x, where $x_1$ is the unique successor of $x_0$ such that $x \in X(x_1)$.
- consider player $i_0$. For every node x in $X^{i_0} \setminus \{r\}$, similarly as before $s^{i_0}$ plays at node x the action played by $s^{i_0}(x_1)$ at node x, where $x_1$ is the unique successor of $x_0$ such that $x \in X(x_1)$. It remains to see what $s^{i_0}$ plays at the root. Fix $x_1^*$ in $\varphi^{-1}(x_0)$ such that $g(x_1^*) = \max_{x_1 \in \varphi^{-1}(x_0)} g(x_1)$. By definition, $s^{i_0}$ plays at $x_0$ the action associated to the transition from the root to $x_1^*$.
Do we have a SPE?
Let z be a non-terminal node. We have to show that s(z) is a NE of $\Gamma(z)$.
Case 1: Assume that $z \neq x_0$. Let $x_1 \in \varphi^{-1}(x_0)$ be such that $z \in X(x_1)$. $s(x_1)$ is a SPE of $\Gamma(x_1)$, hence the restriction of $s(x_1)$ to $\Gamma(z)$ is a NE, hence the restriction of s to $\Gamma(z)$ is a NE.
Case 2: Is s a NE of $\Gamma$? It is clear that every player $i \neq i_0$ is in best reply using $s^i$ against $s^{-i}$. Suppose now that player $i_0$ plays a strategy $\tilde{s}^{i_0}$ in $S^{i_0}$ leading to a successor of $x_0$ denoted by $x_1$. Then the payoff of player $i_0$ using $\tilde{s}^{i_0}$ against $s^{-i_0}$ in $\Gamma$ is at most $g(x_1)$, because $s(x_1)$ is a NE of $\Gamma(x_1)$. Consequently, against $s^{-i_0}$ the payoff for player $i_0$ in $\Gamma$ is at most $\max_{x_1 \in \varphi^{-1}(x_0)} g(x_1)$. Since it is the payoff obtained by player $i_0$ using $s^{i_0}$, this strategy is a best reply against $s^{-i_0}$.
Method to compute SPE: Backwards induction (when the tree is not too large!)
Remark: For zero-sum games with perfect information, Kuhn's theorem implies that the game has a value in pure strategies. For zero-sum games with perfect information where the payoff of player 1 is either 1 (win for player 1) or -1 (win for player 2), it implies that one of the players has a winning strategy. What is the value for chess?
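Backwards induction can be sketched in a few lines for a small tree. The encoding below is an assumption made for this illustration only: a decision node is a dictionary with a `player` and its `moves`, a terminal node is a tuple of payoffs, players are indexed from 0, and ties are broken in favor of the first maximizing action.

```python
# Hypothetical two-player perfect-information game (players indexed 0 and 1).
game = {
    "player": 0,
    "moves": {
        "L": (2, 1),
        "R": {"player": 1, "moves": {"l": (0, 0), "r": (3, 2)}},
    },
}

def backward_induction(node, name="root"):
    """Return (payoff vector, strategy); strategy maps node names to chosen actions."""
    if isinstance(node, tuple):              # terminal node: nothing to choose
        return node, {}
    strategy = {}
    best_action, best_payoff = None, None
    mover = node["player"]
    for action, child in node["moves"].items():
        payoff, sub = backward_induction(child, name + "/" + action)
        strategy.update(sub)                 # keep the choices made in every subgame
        if best_payoff is None or payoff[mover] > best_payoff[mover]:
            best_action, best_payoff = action, payoff
    strategy[name] = best_action
    return best_payoff, strategy

payoff, strategy = backward_induction(game)
print(payoff, strategy)
```

The returned strategy specifies an action at every decision node, including nodes that are not reached, which is what the definition of a subgame-perfect equilibrium requires.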
2.5 Behavior strategies
We fix a finite extensive-form game $\Gamma = \left(N, (X, r, \varphi), (X^i, U^i)_{i \in N \cup \{0\}}, (A^u, \psi^u)_{u \in U}, (\sigma^0_{\{x\}})_{x \in X^0}, (\gamma^i)_{i \in N}\right)$, as defined in definition 2.2.1. A pure strategy of a player i in N is an element $s^i = (s^i_u)_{u \in U^i} \in S^i = \prod_{u \in U^i} A^u$, and a mixed strategy of player i is an element of $\Delta(S^i)$.
Definition 2.5.1. A behavior strategy of a player i in N specifies, for each information set u in $U^i$, the lottery on $A^u$ played by player i at u. Consequently, it is an element $\sigma^i = (\sigma^i_u)_{u \in U^i}$, with $\sigma^i_u \in \Delta(A^u)$ for each u in $U^i$. The set of behavior strategies of player i is denoted by:
$$\Sigma^i = \prod_{u \in U^i} \Delta(A^u).$$
If $\sigma^i$ is a behavior strategy for player i, u is an information set of player i and $a \in A^u$, we denote by $\sigma^i_u(a)$ the probability that player i using $\sigma^i$ plays a when the game is at u.
Interpretation: When player i plays a mixed strategy $\tau^i$ in $\Delta(S^i)$, he randomly chooses a pure strategy at the beginning of the game, and then sticks to this pure strategy. When he plays a behavior strategy $\sigma^i \in \Sigma^i$, at each information set u where he has to play, player i chooses his action according to the probability $\sigma^i_u$. (A unique global lottery versus several local lotteries.)
Notice that a pure strategy of player i can be seen both as a mixed strategy and as a behavior strategy.
Example 2.5.2. A single player. What is the difference between mixed and behavior strategies here?
[Figure: game tree with a single player P1, who plays at two decision nodes, with actions T, B and L, R.]
Fix a behavior strategy profile $\sigma = (\sigma^i)_{i \in N}$. The profile $\sigma$ naturally induces a probability distribution $P_\sigma$ on the set of plays $\Omega$. Formally, $P_\sigma$ is defined as follows (in a way similar to the definition for pure strategies).
Let $\omega = (x_0, ..., x_l)$ be a play, and for each k in $\{0, ..., l-1\}$ denote by $i_k$ the player who plays at $x_k$, by $u_k \in U^{i_k}$ the information set of player $i_k$ which contains $x_k$, and by $a_k \in A^{u_k}$ the action such that $\psi^{u_k}(x_k, a_k) = x_{k+1}$. The probability of $\omega$ if $\sigma$ is played is then defined by:
$$P_\sigma(\omega) = \sigma^{i_0}_{u_0}(a_0)\, \sigma^{i_1}_{u_1}(a_1) \cdots \sigma^{i_{l-1}}_{u_{l-1}}(a_{l-1}).$$
It can easily be shown that $P_\sigma$ really is a probability, i.e. that $P_\sigma(\omega) \in [0, 1]$, and $\sum_{\omega \in \Omega} P_\sigma(\omega) = 1$. The expected payoff for player i when $\sigma$ is played is now denoted by:
$$\hat{g}^i(\sigma) = \mathbb{E}_{P_\sigma}(\gamma^i) = \sum_{\omega \in \Omega} P_\sigma(\omega)\, \gamma^i(\omega).$$
Remark: It is also possible to define, in a natural way, the probability on plays induced by strategy profiles where some of the players play mixed strategies, and other players play behavior strategies.
Recall that we denoted by G the game played with pure strategies. We now have another strategic-form game, the game with behavior strategies:
$$\hat{G} = (N, (\Sigma^i)_{i \in N}, (\hat{g}^i)_{i \in N}).$$
Questions: what is the link between $\hat{G}$ and the mixed extension of G? What is the right notion of random strategies here?
Example 2.5.3. The absent-minded driver. (Complete description of the example to be added)
Fortunately, in usual cases, mixed strategies and behavior strategies are equivalent.
Definition 2.5.4. Let x be a node of player i; we denote by $h^i(x)$ the sequence of information sets of player i and actions taken by player i strictly before arriving at x. If player i is playing for the first time at x, then we have $h^i(x) = \emptyset$.
The game $\Gamma$ has perfect recall if $\forall i \in N$, $\forall u \in U^i$, $\forall x \in u$, $\forall y \in u$, $h^i(x) = h^i(y)$. Idea: players never forget what they know.
Theorem 2.5.1. (Kuhn's theorem). Assume that $\Gamma$ has perfect recall. Consider a player i in N.
Each mixed strategy $\tau^i$ of player i can be represented by a behavior strategy of player i, in the sense that there exists a behavior strategy $\sigma^i$ of player i satisfying: for any strategy profile $\sigma^{-i}$ of the other players ($\sigma^{-i}$ being composed of mixed, pure, or behavior strategies of the other players), the probability induced by $(\tau^i, \sigma^{-i})$ on $\Omega$ is the same as the probability induced by $(\sigma^i, \sigma^{-i})$ on $\Omega$. And vice-versa by exchanging everywhere the words "mixed" and "behavior".
Remarks: This implies that $\hat{G}$ and the mixed extension of G have the same Nash equilibrium payoffs. The proof of Kuhn's theorem is omitted. In practice, when $\Gamma$ has perfect recall, we will identify behavior and mixed strategies and, depending on the context, we will use whichever notion is simpler to deal with.
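One direction of this correspondence can be illustrated by a product construction: draw the action at each information set independently and record the resulting pure strategy. The sketch below is an illustration under perfect recall, not the construction of the omitted proof; the dictionary encoding and the names `u1`, `u2` are hypothetical.

```python
from itertools import product

def behavior_to_mixed(sigma):
    """Map a behavior strategy {info_set: {action: prob}} to a distribution
    over pure strategies, by drawing each local action independently."""
    info_sets = sorted(sigma)
    mixed = {}
    for choice in product(*(sorted(sigma[u]) for u in info_sets)):
        prob = 1.0
        for u, a in zip(info_sets, choice):
            prob *= sigma[u][a]          # local lotteries multiply
        mixed[choice] = prob             # choice[k] is the action at info_sets[k]
    return mixed

# Hypothetical behavior strategy with two information sets.
sigma = {"u1": {"T": 0.5, "B": 0.5}, "u2": {"L": 0.25, "R": 0.75}}
mixed = behavior_to_mixed(sigma)
```

Each pure strategy is a tuple of actions, one per information set in sorted order, and the weights form a probability distribution on the set of pure strategies.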
2.6 Sequential Rationality
In the sequel we will assume that $\Gamma$ is a finite extensive-form game with perfect recall. To compute the Nash equilibria of $\Gamma$, it is often easier to use behavior strategies.
Consider a player i in N, and an information set u of player i: $u = \{x_1, ..., x_p\}$.
Player i has to play at u, but does not know at which node of u the game is. A belief of player i at u is a probability $\mu_u$ over the nodes of u.
Fix a belief $\mu_u \in \Delta(u)$. We define the subgame $\Gamma(u, \mu_u)$ as the extensive-form game where:
- nature first chooses a node x in u according to $\mu_u$,
- then the game is at x and continues as in $\Gamma$.
Formally, to define $\Gamma(u, \mu_u)$, we add a new node $r_u$ of nature (with $\{r_u\} \in U^0$), with $\varphi(x) = r_u$ for each x in u. In the game $\Gamma(u, \mu_u)$, the root is $r_u$, the set of nodes is $X_u = \{x \in X, \exists l \geq 0, \varphi^l(x) = r_u\}$, the set of players is $N_u = \{j \in N, X^j \cap X_u \neq \emptyset\}$, the nodes of a player j in $N_u$ are $X^j_u = X^j \cap X_u$, the information sets of player j are elements of $U^j_u = \{v \cap X_u, v \in U^j \text{ s.t. } v \cap X_u \neq \emptyset\}$. And the rest (payoffs, actions, transitions...) is defined by restriction from the objects of $\Gamma$.
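Let $\sigma = (\sigma^j)_{j \in N}$ be a behavior strategy profile in $\Gamma$. For each player j in N, we have $\sigma^j = (\sigma^j_v)_{v \in U^j} \in \prod_{v \in U^j} \Delta(A^v)$. We define $\sigma^j_{u+}$ as the restriction of $\sigma^j$ to the information sets of player j coming after (and including) u: we put $\sigma^j_{u+} = (\sigma^j_v)_{v \in U^j_u}$, where for $v = w \cap X_u$ in $U^j_u$ with $w \in U^j$, $\sigma^j_v$ is defined as $\sigma^j_w$. We denote by $\sigma_{u+} = (\sigma^j_{u+})_{j \in N_u}$, and by $g^i_{u,\mu_u}(\sigma_{u+})$ the payoff of player i induced by the strategy profile $\sigma_{u+}$ in the game $\Gamma(u, \mu_u)$.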
Definition 2.6.1. $\sigma$ is sequentially rational at $u \in U^i$ with respect to the belief $\mu_u$ if for every strategy $\tau^i_{u+}$ of player i in $\Gamma(u, \mu_u)$, we have:
$$g^i_{u,\mu_u}(\sigma_{u+}) \geq g^i_{u,\mu_u}(\tau^i_{u+}, \sigma^{-i}_{u+}).$$
It is optimal for player i at u to continue to play according to $\sigma^i$ if: (1) the belief of player i at u is $\mu_u$, and (2) the other players continue to play according to $\sigma$.
Example 2.6.2.
[Figure: game tree. P1 plays at the root r and chooses among L, M and R; P2 plays at the information set $\{x_1, x_2\}$ and chooses between a and b. Terminal payoffs shown: (0, 1), (2, 0), (0, 0), (2, 1) and (1, 2).]
Consider the pure strategy $\sigma^2 = a$ of player 2. For which belief is $\sigma^2$ sequentially rational at $\{x_1, x_2\}$?
We denote by $P_\sigma(u)$ the probability that the game goes through u if $\sigma$ is played. Since $u = \{x_1, ..., x_p\}$, we simply have $P_\sigma(u) = P_\sigma(x_1) + ... + P_\sigma(x_p)$.
If $P_\sigma(u) > 0$, it is natural to assume that the belief of player i at u is given by Bayes' rule:
$$\forall k \in \{1, ..., p\}, \quad \mu_u(x_k) = P_\sigma(x_k | u) = \frac{P_\sigma(x_k)}{P_\sigma(u)}.$$
In short, we have $\mu_u = P_\sigma(\cdot | u) \in \Delta(u)$.
Notice that we have:
$$g^i_{u,\mu_u}(\sigma_{u+}) = \mathbb{E}_{P_\sigma}(g^i | u).$$
The continuation payoff of player i after u is the conditional expected payoff of player i knowing that the game goes through u.
Example 2.6.2 again: Is $\sigma^2$ sequentially rational at $\{x_1, x_2\}$ with the belief induced by the strategy $\sigma^1 = 1/4\, L + 1/2\, M + 1/4\, R$ of player 1?
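Bayes' rule at an information set is a one-line normalization. The sketch below uses exact fractions; the reach probabilities 1/4 for $x_1$ and 1/2 for $x_2$ are an assumption made for illustration (they presume that L leads to $x_1$ and M to $x_2$, which the figure would have to confirm).

```python
from fractions import Fraction

def bayes_belief(reach):
    """reach: dict node -> probability of reaching that node.
    Returns the conditional probabilities given that the information set is reached."""
    total = sum(reach.values())          # probability of reaching the information set
    if total == 0:
        raise ValueError("Bayes' rule does not apply: the information set has probability 0")
    return {x: p / total for x, p in reach.items()}

# Assumed reach probabilities, for illustration only.
belief = bayes_belief({"x1": Fraction(1, 4), "x2": Fraction(1, 2)})
print(belief)
```

The function raises an error when the information set is reached with probability 0, which is exactly the case where Bayes' rule leaves the belief undetermined.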
The following result is important in practice to compute Nash equilibria with behavior strategies.
Theorem 2.6.1. Let $\sigma$ be a behavior strategy profile, and let i be a player in N. $\sigma^i$ is a best reply against $\sigma^{-i}$ if and only if for each u in $U^i$ such that $P_\sigma(u) > 0$, $\sigma$ is sequentially rational at u with respect to $P_\sigma(\cdot | u)$.
The following corollary is then immediate.
Corollary 2.6.3. $\sigma$ is a Nash equilibrium of $\Gamma$ if and only if for each player i in N, for each information set u in $U^i$ such that $P_\sigma(u) > 0$, $\sigma$ is sequentially rational at u with respect to $P_\sigma(\cdot | u)$.
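Proof of the theorem:
$\Rightarrow$ Assume that $\sigma^i$ is a best reply against $\sigma^{-i}$. For the sake of contradiction, let v in $U^i$, and $\tau^i_{v+}$ in $\Sigma^i_{v+}$ be such that $P_\sigma(v) > 0$ and $g^i_{v,\mu_v}(\sigma^i_{v+}, \sigma^{-i}_{v+}) < g^i_{v,\mu_v}(\tau^i_{v+}, \sigma^{-i}_{v+})$, with $\mu_v = P_\sigma(\cdot | v)$. We will show that the strategy $\hat{\sigma}^i$ such that player i plays as $\sigma^i$ if the game does not reach v, and as $\tau^i_{v+}$ as soon as the game reaches v, is a profitable deviation for player i.
Let $U^i_v$ be the set of information sets of player i in $\Gamma(v, \mu_v)$. Since player i has perfect recall, $U^i_v \subset U^i$. We have $\tau^i_{v+} = (\tau^i_{v+,u})_{u \in U^i_v}$. Define now the behavior strategy $\hat{\sigma}^i = (\hat{\sigma}^i_u)_{u \in U^i}$ of player i in $\Gamma$ as:
$$\hat{\sigma}^i_u = \sigma^i_u \text{ if } u \notin U^i_v, \qquad \hat{\sigma}^i_u = \tau^i_{v+,u} \text{ if } u \in U^i_v.$$
We have $P_\sigma(x) = P_{\hat{\sigma}^i, \sigma^{-i}}(x)$ for each x in v, hence $P_\sigma(v) = P_{\hat{\sigma}^i, \sigma^{-i}}(v) > 0$, and $\mu_v = P_\sigma(\cdot | v) = P_{\hat{\sigma}^i, \sigma^{-i}}(\cdot | v)$. By conditioning payoffs ($v^c$ denotes the complement of the event: the game reaches v), we have:
$$g^i(\sigma) = P_\sigma(v)\, g^i_{v,\mu_v}(\sigma^i_{v+}, \sigma^{-i}_{v+}) + P_\sigma(v^c)\, \mathbb{E}_\sigma(g^i | v^c)$$
$$g^i(\hat{\sigma}^i, \sigma^{-i}) = P_{\hat{\sigma}^i, \sigma^{-i}}(v)\, g^i_{v,\mu_v}(\tau^i_{v+}, \sigma^{-i}_{v+}) + P_{\hat{\sigma}^i, \sigma^{-i}}(v^c)\, \mathbb{E}_{\hat{\sigma}^i, \sigma^{-i}}(g^i | v^c)$$
Since $P_\sigma(v) = P_{\hat{\sigma}^i, \sigma^{-i}}(v) > 0$ and $\mathbb{E}_\sigma(g^i | v^c) = \mathbb{E}_{\hat{\sigma}^i, \sigma^{-i}}(g^i | v^c)$, we obtain $g^i(\hat{\sigma}^i, \sigma^{-i}) > g^i(\sigma)$, hence a contradiction with $\sigma^i$ being a best reply against $\sigma^{-i}$.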
$\Leftarrow$ The other part of the proof is more interesting. Assume now that $\sigma^i$ is not a best reply against $\sigma^{-i}$. Denote by $A = \{\tau^i \in \Sigma^i, g^i(\tau^i, \sigma^{-i}) > g^i(\sigma)\} \neq \emptyset$. Take $\tau^i = (\tau^i_u)_{u \in U^i}$ in A that minimizes the number of information sets u in $U^i$ such that $\tau^i_u \neq \sigma^i_u$. Fix u such that $\tau^i_u \neq \sigma^i_u$.
One can define, since player i has perfect recall, the first information set v for player i in the path from the root to any x in u. Then $P_\sigma(v) = P_{\tau^i, \sigma^{-i}}(v)$ is necessarily positive, otherwise $P_\sigma(u) = P_{\tau^i, \sigma^{-i}}(u) = 0$ and $\tau^i$ does not satisfy the minimality condition. We also have $P_\sigma(\cdot | v) = P_{\tau^i, \sigma^{-i}}(\cdot | v)$, and we denote this probability by $\mu_v$. We have as in the previous part of the proof:
$$g^i(\sigma) = P_\sigma(v)\, g^i_{v,\mu_v}(\sigma^i_{v+}, \sigma^{-i}_{v+}) + P_\sigma(v^c)\, \mathbb{E}_\sigma(g^i | v^c)$$
$$g^i(\tau^i, \sigma^{-i}) = P_\sigma(v)\, g^i_{v,\mu_v}(\tau^i_{v+}, \sigma^{-i}_{v+}) + P_\sigma(v^c)\, \mathbb{E}_{\tau^i, \sigma^{-i}}(g^i | v^c)$$
We have $g^i(\tau^i, \sigma^{-i}) > g^i(\sigma)$. Assume $g^i_{v,\mu_v}(\tau^i_{v+}, \sigma^{-i}_{v+}) \leq g^i_{v,\mu_v}(\sigma^i_{v+}, \sigma^{-i}_{v+})$. Then one must have $P_\sigma(v^c) > 0$, and $\mathbb{E}_{\tau^i, \sigma^{-i}}(g^i | v^c) > \mathbb{E}_\sigma(g^i | v^c)$. By considering $\hat{\sigma}^i$ in $\Sigma^i$ that plays as $\sigma^i$ at and after v and as $\tau^i$ otherwise, we obtain that $g^i(\hat{\sigma}^i, \sigma^{-i}) > g^i(\sigma)$. We get a contradiction because of the minimality condition for $\tau^i$, hence our assumption was false, and $g^i_{v,\mu_v}(\sigma^i_{v+}, \sigma^{-i}_{v+}) < g^i_{v,\mu_v}(\tau^i_{v+}, \sigma^{-i}_{v+})$, and $\sigma$ is not sequentially rational at v. This concludes the proof.
2.7 Subgame-perfect, Bayesian-perfect and Sequential Equilibria
2.7.1 Subgame-perfect equilibria
Definition 2.7.1. Let u be an information set of some player i in N such that u is a singleton: u = {x}. There is a unique possible belief at u, it is the Dirac measure $\delta_x$. We simply denote by $\Gamma(u) = \Gamma(u, \delta_x)$ the subgame starting at x. If $X_u$ contains each information set w of $\Gamma$ such that $X_u \cap w \neq \emptyset$, we say that $\Gamma(u)$ is a proper subgame of $\Gamma$.
Definition 2.7.2. A subgame-perfect equilibrium (SPE) of $\Gamma$ is a behavior strategy profile $\sigma$ such that for each proper subgame $\Gamma(u)$, $\sigma_{u+}$ is a Nash equilibrium of $\Gamma(u)$.
Lemma 2.7.3. A SPE of $\Gamma$ is a Nash equilibrium of $\Gamma$.
Proof: $\{r\} \in U$ since we assumed perfect recall, so $\Gamma(\{r\})$ is a proper subgame, and it coincides with $\Gamma$.
2.7.2 Bayesian-perfect equilibria
Definition 2.7.4. A system of beliefs is an element $\mu = (\mu_u)_{u \in U}$ with $\mu_u \in \Delta(u)$ for each u in U. A strategy profile $\sigma$ is sequentially rational with respect to the system of beliefs $\mu$ if $\forall i \in N$, $\forall u \in U^i$, $\sigma$ is sequentially rational at u with respect to $\mu_u$. A system of beliefs $\mu$ is said to be compatible with a strategy profile $\sigma$ if for each u in U such that $P_\sigma(u) > 0$, $\mu_u = P_\sigma(\cdot | u)$.
Definition 2.7.5. A behavior strategy profile $\sigma$ is a Bayesian-perfect equilibrium (BPE) of $\Gamma$ if there exists a system of beliefs $\mu$ such that:
- $\mu$ is compatible with $\sigma$, and
- $\sigma$ is sequentially rational with respect to $\mu$.
Lemma 2.7.6. A BPE of $\Gamma$ is a Nash equilibrium of $\Gamma$.
Proof: Apply corollary 2.6.3.
2.7.3 Sequential equilibria
Sequential equilibria are defined as BPE, but with a stronger notion of compatibility that will be called consistency. Given a completely mixed strategy profile $\sigma$ (i.e. a strategy profile where each action is played with positive probability), there is a unique system of beliefs, denoted by $\mu(\sigma)$, which is compatible with $\sigma$.
Definition 2.7.7. A system of beliefs $\mu = (\mu_u)_{u \in U}$ is said to be consistent with a strategy profile $\sigma$ if there exists a sequence $(\sigma_t)_{t \in \mathbb{N}}$ of completely mixed strategy profiles such that $\sigma_t \to \sigma$ and $\mu(\sigma_t) \to \mu$ as $t \to \infty$.
Definition 2.7.8. A behavior strategy profile $\sigma$ is a sequential equilibrium of $\Gamma$ if there exists a system of beliefs $\mu$ such that:
- $\mu$ is consistent with $\sigma$, and
- $\sigma$ is sequentially rational with respect to $\mu$.
Theorem 2.7.1.
1) Every sequential equilibrium is a BPE.
2) Every sequential equilibrium is a SPE.
Proof:
1) We show that consistency of beliefs implies compatibility. Fix a profile $\sigma$ of behavior strategies, and a system of beliefs $\mu$ that is consistent with $\sigma$. Let $(\sigma_t)_{t \in \mathbb{N}}$ be as in definition 2.7.7. Let u be an information set such that $P_\sigma(u) > 0$. We have for any x in u, $P_{\sigma_t}(x) \to P_\sigma(x)$ as $t \to \infty$, hence $\mu(\sigma_t)_u \to P_\sigma(\cdot | u)$ as $t \to \infty$. So $\mu_u = P_\sigma(\cdot | u)$, and $\mu$ is compatible with $\sigma$.
2) Let $\sigma$ be a sequential equilibrium. Take $(\sigma_t)_{t \in \mathbb{N}}$ and $\mu$ as in definitions 2.7.7 and 2.7.8. Fix v = {x} an information set such that $\Gamma(v)$ is a proper subgame of $\Gamma$. We place ourselves in $\Gamma(v)$.
For any player i in $N_v$, the set of information sets of player i is $U^i_v = \{u \in U^i, u \cap X_v \neq \emptyset\} \subset U^i$. Hence the restriction of any behavior strategy profile $\sigma' = (\sigma'^i)_{i \in N}$ to $\Gamma(v)$ is just $\sigma'_{v+} = (\sigma'^i_{v+})_{i \in N_v}$, with $\sigma'^i_{v+}$ being the projection of $\sigma'^i$ on $\prod_{u \in U^i_v} \Delta(A^u)$ for any player i in $N_v$. So $\sigma_{t,v+} \to \sigma_{v+}$ as $t \to \infty$, and $\sigma_{t,v+}$ (restriction of $\sigma_t$ to $\Gamma(v)$) is completely mixed for each t.
Let $u = \{x_1, ..., x_p\}$ be an information set of $\Gamma(v)$. For any k = 1, ..., p, $P_{\sigma_{t,v+}}(x_k) = \frac{P_{\sigma_t}(x_k)}{P_{\sigma_t}(v)}$, hence $\mu(\sigma_{t,v+})_u(x_k) = \frac{P_{\sigma_t}(x_k)}{P_{\sigma_t}(u)}$ and $\mu(\sigma_{t,v+})_u = \mu(\sigma_t)_u$.
Thus $\mu(\sigma_{t,v+})$ is the restriction of $\mu(\sigma_t)$ to the information sets of $\Gamma(v)$, hence $\mu(\sigma_{t,v+})$ converges as t goes to infinity to $\mu_{v+}$, defined as the restriction of $\mu$ to the information sets of $\Gamma(v)$. And it is plain that in $\Gamma(v)$, $\sigma_{v+}$ is sequentially rational with respect to $\mu_{v+}$.
All this shows that $\sigma_{v+}$ is a sequential equilibrium of $\Gamma(v)$. Point 1) of this proof shows it is a BPE, hence a Nash equilibrium, of $\Gamma(v)$. $\sigma$ is thus a SPE of $\Gamma$.
To summarize, we have the following implications:
Sequential Equilibrium $\Longrightarrow$ BPE $\Longrightarrow$ Nash Equilibrium
Sequential Equilibrium $\Longrightarrow$ SPE $\Longrightarrow$ Nash Equilibrium
And it should be clear that in some cases life is easier.
Lemma 2.7.9. Let $\sigma$ be a behavior strategy profile such that for each player i in N and each information set u in $U^i$, we have $P_\sigma(u) > 0$. Then the following properties are equivalent:
- $\sigma$ is a Nash equilibrium of $\Gamma$,
- $\sigma$ is a BPE of $\Gamma$,
- $\sigma$ is a SPE of $\Gamma$,
- $\sigma$ is a sequential equilibrium of $\Gamma$.
A corollary of the following result will be the existence of both SPE and BPE.
Theorem 2.7.2. Every finite extensive-form game has a sequential equilibrium.
Proof: Uses the existence of trembling-hand perfect equilibria (see exercise 6.1.23) for the normal-agent form of $\Gamma$.
We first define the Normal-Agent form of $\Gamma$ as the strategic game $NA(\Gamma) = (M, (A^m)_{m \in M}, (g^m)_{m \in M})$, where the set of players is $M = \{(i, u), i \in N, u \in U^i\}$, the set of actions of player (i, u) is $A^u$, and the payoff function of player (i, u) is given by the payoff function of player i in the strategic form of $\Gamma$. The idea is that every player i is divided into a team of players, one for each information set of player i.
$NA(\Gamma)$ is a finite strategic game, hence it has a trembling-hand perfect equilibrium $\sigma$ as defined in exercise 6.1.23, i.e. there exists a sequence $(\varepsilon_t, \sigma_t)_{t \in \mathbb{N}}$ such that: $\forall t$, $\sigma_t$ is an $\varepsilon_t$-perfect equilibrium (and in particular is a completely mixed strategy profile), $\varepsilon_t \to 0$ and $\sigma_t \to \sigma$ as $t \to \infty$. One can show by contradiction that there exists $t_0$ such that for $t \geq t_0$, $\sigma$ is a best response against $\sigma_t$ in $NA(\Gamma)$, and we assume without loss of generality that $t_0 = 0$.
Formally, $\sigma = (\sigma^{(i,u)})_{(i,u) \in M}$ with $\sigma^{(i,u)} \in \Delta(A^u)$ for all (i, u). For each player i, this obviously defines a behavior strategy $\sigma^i$ in the original game $\Gamma$ (for each information set $u \in U^i$, let $\sigma^i_u$ be $\sigma^{(i,u)}$) and we identify $\sigma$ with the behavior strategy profile $(\sigma^i)_{i \in N}$. For each t, we similarly identify $\sigma_t$ with a completely mixed behavior strategy profile $(\sigma^i_t)_{i \in N}$ in $\Gamma$. We finish the proof by showing that $\sigma$ is a sequential equilibrium of $\Gamma$. By considering a convergent subsequence, one may assume that $\mu(\sigma_t)$ converges as t goes to infinity to some system of beliefs $\mu$. We prove that $\sigma$ is sequentially rational with respect to $\mu$.
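Fix some player i in N and u in $U^i$. For any behavior strategy $\tau^i = (\tau^i_v)_{v \in U^i}$ of player i, we denote by $\tau^i_{>u}$ the restriction of $\tau^i$ to the information sets that come strictly after u: $\tau^i_{>u} = (\tau^i_v)_{v \in U^i_u \setminus \{u\}}$. For any t in $\mathbb{N}$, we put $\mu_{t,u} = \mu(\sigma_t)_u$ for short. The best response condition for player (i, u) in $NA(\Gamma)$ gives that: $\forall \tau^i_u \in \Delta(A^u)$,
$$g^i_{u,\mu_{t,u}}(\sigma^i_u, \sigma^i_{t,>u}, \sigma^{-i}_{t,u+}) \geq g^i_{u,\mu_{t,u}}(\tau^i_u, \sigma^i_{t,>u}, \sigma^{-i}_{t,u+})$$
hence by taking the limit as t goes to infinity:
$$g^i_{u,\mu_u}(\sigma^i_{u+}, \sigma^{-i}_{u+}) \geq g^i_{u,\mu_u}(\tau^i_u, \sigma^i_{>u}, \sigma^{-i}_{u+}) \qquad (*)$$
We finally show that $\forall \tau^i_{u+} \in \Sigma^i_{u+}$:
$$g^i_{u,\mu_u}(\sigma^i_{u+}, \sigma^{-i}_{u+}) \geq g^i_{u,\mu_u}(\tau^i_{u+}, \sigma^{-i}_{u+}) \qquad (**)$$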
Assume $(**)$ is not satisfied for some $\tau^i_{u+}$, and fix such $\tau^i_{u+}$ that minimizes the cardinal of $A(\tau^i_{u+}) =_{def} \{v \in U^i_u, \tau^i_v \neq \sigma^i_v\}$. Fix now v in $A(\tau^i_{u+})$ such that for all w distinct from v in $A(\tau^i_{u+})$, no path from the root to some element of w goes through v. v can be seen as a last node where $\tau^i_{u+}$ and $\sigma^i_{u+}$ differ.
Fix t in $\mathbb{N}$ and consider the subgame $\Gamma(u, \mu_{t,u})$. If in this subgame v is reached with positive probability if $(\tau^i_{u+}, \sigma^{-i}_{t,u+})$ is played, the conditional probability on v does not depend on player i's strategy (since player i has perfect recall), hence equals the conditional probability obtained if $(\sigma^i_{t,u+}, \sigma^{-i}_{t,u+})$ is played, that is $\mu_{t,v}$. We now place ourselves in $\Gamma(u, \mu_u)$. Denote by p the probability in [0, 1] that in $\Gamma(u, \mu_u)$, v is reached if $(\tau^i_{u+}, \sigma^{-i}_{u+})$ is played. If p > 0, by letting t go to infinity in the previous argument, we have that the conditional probability on v if $(\tau^i_{u+}, \sigma^{-i}_{u+})$ is played in $\Gamma(u, \mu_u)$ is $\mu_v$. $v^c$ denoting the event "v is not reached", we can thus write, p being positive or not:
$$g^i_{u,\mu_u}(\tau^i_{u+}, \sigma^{-i}_{u+}) = p\, g^i_{v,\mu_v}(\tau^i_v, \sigma^i_{>v}, \sigma^{-i}_{v+}) + (1 - p)\, \mathbb{E}_{\tau^i_{u+}, \sigma^{-i}_{u+}}(g^i_{u,\mu_u} | v^c)$$
Define now $\hat{\tau}^i_{u+}$ as the strategy $(\hat{\tau}^i_w)_{w \in U^i_u}$ s.t. $\hat{\tau}^i_v = \sigma^i_v$ and $\hat{\tau}^i_w = \tau^i_w$ for $w \neq v$. $\hat{\tau}^i_{u+}$ plays as $\tau^i_{u+}$ except in v where it plays as $\sigma^i$. We also have:
$$g^i_{u,\mu_u}(\hat{\tau}^i_{u+}, \sigma^{-i}_{u+}) = p\, g^i_{v,\mu_v}(\sigma^i_v, \sigma^i_{>v}, \sigma^{-i}_{v+}) + (1 - p)\, \mathbb{E}_{\tau^i_{u+}, \sigma^{-i}_{u+}}(g^i_{u,\mu_u} | v^c)$$
Using $(*)$ for v and $\tau^i_v$, we obtain that $g^i_{u,\mu_u}(\hat{\tau}^i_{u+}, \sigma^{-i}_{u+}) \geq g^i_{u,\mu_u}(\tau^i_{u+}, \sigma^{-i}_{u+})$, which is greater than $g^i_{u,\mu_u}(\sigma^i_{u+}, \sigma^{-i}_{u+})$ by assumption. Since $A(\hat{\tau}^i_{u+})$ has one element less than $A(\tau^i_{u+})$, we get a contradiction, and $(**)$ is proved for all $\tau^i_{u+}$.
Hence $\sigma$ is sequentially rational with respect to $\mu$, and thus is a sequential equilibrium.
Chapter 3
Bayesian games and games
with incomplete information
3.1 Modeling incomplete information
When we study a strategic-form game G, we have in mind that all the players know the game. What happens when it is not the case? A player may not perfectly know: a) the payoffs of the other players (or even his own payoff, e.g. some auctions, options), b) the available actions for the other players, or c) even the set of players.
To model a strategic interaction, we may have to consider a family of games $(G_k)_{k \in K}$, where $k \in K$ is a parameter called state of nature. Some state k is the true state. The game $G_k$ will be played, and the players have private information on k. To have a complete description of the situation, it is necessary to know:
- the belief of each player on the true state k,
- the belief of each player on the beliefs of the other players on k,
- etc.
Fortunately, one can show (Harsanyi 1967-68, Mertens-Zamir 1985) that it is possible to define for each player i a universal type space $T^i$, with the property that a type $t^i$ of player i represents everything known by player i. This leads to the following definition of a Bayesian game.
3.2 Bayesian games
Definition 3.2.1.
A finite Bayesian game is an element $G_B = (N, (A^i)_{i \in N}, (T^i)_{i \in N}, (p^i)_{i \in N}, (g^i)_{i \in N})$ where:
- N = {1, ..., n} is the set of players,
- $\forall i \in N$, $A^i$ is the set of actions of Player i,
- $\forall i \in N$, $T^i$ is the set of types of Player i,
- $\forall i \in N$, $p^i : T^i \to \Delta(T^{-i})$ associates to each type of Player i his belief over the types of the other players,
- $\forall i \in N$, $g^i : A \times T \to \mathbb{R}$ gives the payoff of Player i as a function of action profiles and type profiles,
- N, $A^i$ and $T^i$ are non-empty finite sets for each i, and as usual we put $A = \prod_{i \in N} A^i$, $T = \prod_{i \in N} T^i$ and $T^{-i} = \prod_{j \neq i} T^j$.
Interpretation:
- each player i has a true type $t^i$, and knows his true type. His information on other players' types is given by $p^i(t^i)$,
- each player i chooses $a^i \in A^i$ (the choices are simultaneous),
- and finally, the payoff of player i is $g^i(a, t)$, where $a = (a^j)_{j \in N}$, $t = (t^j)_{j \in N}$.
The game $G_B$ is known by all the players.
Remarks:
1) It is a generalization of the definition of a (finite) strategic-form game.
2) The set of states of nature has disappeared here, since it is always possible to include it into types.
3) The incomplete information on the sets of actions or players has also disappeared. It is generally possible to incorporate this by: - giving an extremely low payoff to an action which should not be available for some type, - considering payoffs which do not depend on the action of a player who is absent.
Examples of beliefs: with 2 players, $N = \{1, 2\}$.
1) $T^1 = \{a\}$, $T^2 = \{b\}$.
Then necessarily $p^1 : T^1 \to \Delta(T^2)$ is such that $p^1(a) = \delta_b$, and $p^2 : T^2 \to \Delta(T^1)$ is such that $p^2(b) = \delta_a$. This is the classic case of complete information. The type vector $(t^1, t^2) = (a, b)$ is common knowledge, and this is a usual strategic-form game.
2) $T^1 = \{a, b\}$, $T^2 = \{c\}$, $p^2(c) = \frac{1}{2}\delta_a + \frac{1}{2}\delta_b$.
We will see that this game is equivalent to the extensive-form game where: nature first chooses $a$ or $b$ with probability $(1/2, 1/2)$, then player 1 learns the choice of nature, player 2 does not, and finally players choose actions as usual.
3) $T^1 = \{a, b\}$, $T^2 = \{c, d\}$, $p^1(a) = \delta_c$, $p^1(b) = \delta_d$, $p^2(c) = \delta_b$, $p^2(d) = \delta_a$.
If $t = (a, c)$, player 1 thinks player 2 has type $c$, and player 2 thinks player 1 has type $b$ (and consequently player 2 is wrong in this case).
If $t = (a, d)$, then similarly player 1 is wrong (he thinks player 2 has type $c$).
Bayesian games allow for the study of interactions where the beliefs of the players are not consistent.
Definition 3.2.2. A behavior strategy of player $i$ in $G_B$ is a mapping $\sigma^i : T^i \to \Delta(A^i)$. We denote by $\Sigma^i$ the set of behavior strategies of player $i$.
Fix now a strategy profile $\sigma = (\sigma^1, ..., \sigma^n)$. For each player $i$ and type $t^i$ in $T^i$, we denote by $\mathbb{E}_\sigma(g^i | t^i)$ the expected payoff of player $i$ with type $t^i$ if the other players use $\sigma^{-i}$:
\[
\mathbb{E}_\sigma(g^i | t^i) = \sum_{t^{-i} \in T^{-i}} p^i(t^i)(t^{-i}) \; g^i\left(\sigma^1(t^1), ..., \sigma^n(t^n), t^1, ..., t^n\right).
\]
Remarks:
1) We will see examples where the type spaces are infinite. There we will have:
\[
\mathbb{E}_\sigma(g^i | t^i) = \int_{t^{-i} \in T^{-i}} g^i\left(\sigma^1(t^1), ..., \sigma^n(t^n), t^1, ..., t^n\right) \, dp^i(t^i)(t^{-i}).
\]
2) $\mathbb{E}_\sigma(g^i)$ is not defined.
Definition 3.2.3. A strategy profile $\sigma = (\sigma^1, ..., \sigma^n)$ is a Bayesian equilibrium of the game $G_B$ if for each $i$ and $t^i$, it is optimal for player $i$ to play $\sigma^i(t^i)$ if his type is $t^i$ and the other players play according to $\sigma^{-i}$. Formally, $\sigma$ is a Bayesian equilibrium if for each $i$ in $N$ and $t^i$ in $T^i$:
\[
\mathbb{E}_\sigma(g^i | t^i) \geq \mathbb{E}_{\tau^i, \sigma^{-i}}(g^i | t^i) \quad \forall \tau^i \in \Sigma^i.
\]
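The definition can be checked mechanically on a small example. The sketch below is only an illustration: it takes the belief structure of example 2 above (player 1 has types a, b; player 2 has the single type c and believes both types equally likely), while the action names `x`, `y` and the payoff function are hypothetical choices made for this sketch. Players are indexed 0 and 1 in the code. Since interim payoffs are linear in a player's own mixed action, it is enough to test deviations to pure actions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ONE = Fraction(1)

# Types and actions of players 0 and 1 (hypothetical action names).
TYPES = [["a", "b"], ["c"]]
ACTIONS = [["x", "y"], ["x", "y"]]

# BELIEFS[i][t_i] is the belief p^i(t^i) over the other player's types.
BELIEFS = [
    {"a": {"c": ONE}, "b": {"c": ONE}},
    {"c": {"a": HALF, "b": HALF}},
]

def payoff(i, a, t):
    """Illustrative payoffs g^i(a, t); a and t are (player 0, player 1) pairs."""
    if i == 0:
        # Player 0 wants action x when his type is a, and y when it is b.
        return 1 if (t[0], a[0]) in {("a", "x"), ("b", "y")} else 0
    # Player 1 wants to match the action of player 0.
    return 1 if a[0] == a[1] else 0

def interim_payoff(i, t_i, sigma):
    """Expected payoff of player i with type t_i under the profile sigma."""
    j = 1 - i
    total = Fraction(0)
    for t_j, p_t in BELIEFS[i][t_i].items():
        for a_i, p_i in sigma[i][t_i].items():
            for a_j, p_j in sigma[j][t_j].items():
                a = (a_i, a_j) if i == 0 else (a_j, a_i)
                t = (t_i, t_j) if i == 0 else (t_j, t_i)
                total += p_t * p_i * p_j * payoff(i, a, t)
    return total

def is_bayesian_equilibrium(sigma):
    """Check that no type of no player gains by switching to a pure action."""
    for i in (0, 1):
        for t_i in TYPES[i]:
            current = interim_payoff(i, t_i, sigma)
            for b in ACTIONS[i]:
                deviation = [dict(s) for s in sigma]
                deviation[i][t_i] = {b: ONE}
                if interim_payoff(i, t_i, deviation) > current:
                    return False
    return True

# Player 0 follows his type, player 1 plays x.
candidate = [{"a": {"x": ONE}, "b": {"y": ONE}}, {"c": {"x": ONE}}]
# Same profile, except that type a of player 0 plays y.
non_equilibrium = [{"a": {"y": ONE}, "b": {"y": ONE}}, {"c": {"x": ONE}}]
```

In this toy game `is_bayesian_equilibrium(candidate)` holds while `is_bayesian_equilibrium(non_equilibrium)` fails, because type a of player 0 would gain by switching to `x`.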
3.3 Games with incomplete information
Definition 3.3.1. The beliefs $(p^i)_{i \in N}$ are consistent if there exists a probability $p$ over $T$ such that all the beliefs can be derived from $p$ using Bayes' rule:
\[
\forall i \in N, \ \forall t^i \in T^i, \quad p(t^i) > 0 \ \text{ and } \ p^i(t^i) = p(\cdot \,|\, t^i) \in \Delta(T^{-i}).
\]
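As an illustration of Bayes' rule in this definition, the following sketch derives the beliefs of each player from a prior in the two-player case. The function name `beliefs_from_prior` and the dictionary encoding of the prior are assumptions of this sketch, not notation from the notes.

```python
from fractions import Fraction

def beliefs_from_prior(prior, i):
    """Derive p^i from a prior over type profiles (two players, indexed 0 and 1).

    prior maps type profiles (t^0, t^1) to probabilities. The result maps each
    type t^i to the conditional distribution p(. | t^i) over the other types.
    """
    marginal = {}
    for t, p in prior.items():
        marginal[t[i]] = marginal.get(t[i], Fraction(0)) + p
    beliefs = {}
    for t, p in prior.items():
        if marginal[t[i]] == 0:
            # The definition requires p(t^i) > 0 for every type.
            raise ValueError("every type must have positive prior probability")
        beliefs.setdefault(t[i], {})[t[1 - i]] = p / marginal[t[i]]
    return beliefs

# Example 2 above: nature picks a or b with equal probability, player 2 has type c.
prior = {("a", "c"): Fraction(1, 2), ("b", "c"): Fraction(1, 2)}
beliefs_player_1 = beliefs_from_prior(prior, 0)
beliefs_player_2 = beliefs_from_prior(prior, 1)
```

With this prior one recovers the beliefs of example 2. By contrast, no prior generates the beliefs of example 3: $p^1(a) = \delta_c$ forces $p(a, d) = 0$, $p^1(b) = \delta_d$ forces $p(b, c) = 0$, $p^2(c) = \delta_b$ forces $p(a, c) = 0$ and $p^2(d) = \delta_a$ forces $p(b, d) = 0$, so $p$ would vanish everywhere.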
Fix a finite Bayesian game $G_B = (N, (A^i)_{i \in N}, (T^i)_{i \in N}, (p^i)_{i \in N}, (g^i)_{i \in N})$ with consistent beliefs. Consider $p \in \Delta(T)$ satisfying the above definition. We define an associated game with incomplete information $\hat{G}$ as follows. $\hat{G}$ is the extensive-form game where:
- first, nature chooses $t = (t^i)_{i \in N}$ according to $p$. Each player $i$ learns $t^i$, and only $t^i$,
- then, the players simultaneously choose an action in their own set of actions.
If $a \in A$ is the selected action profile, the payoff of player $i$ is $g^i(a, t)$.
Example: 2 players, 2 types and 2 actions for each player. Draw the associated extensive-form game.
Notice that the behavior strategies of player $i$ in the game $\hat{G}$ coincide with the elements of $\Sigma^i$. Notice also that the game $\hat{G}$ has perfect recall, hence by Kuhn's theorem mixed and behavior strategies are equivalent.
In $\hat{G}$, the payoff of player $i$ if the strategy profile $\sigma = (\sigma^1, ..., \sigma^n)$ is played is denoted by $g^i(\sigma)$. This is nothing but:
\[
\begin{aligned}
g^i(\sigma) &= \sum_{t = (t^1, ..., t^n) \in T} p(t) \, g^i(\sigma^1(t^1), ..., \sigma^n(t^n), t^1, ..., t^n) \\
&= \sum_{t^i \in T^i} p(t^i) \sum_{t^{-i} \in T^{-i}} p(t^{-i} | t^i) \, g^i(\sigma^1(t^1), ..., \sigma^n(t^n), t^1, ..., t^n) \\
&= \sum_{t^i \in T^i} p(t^i) \, \mathbb{E}_\sigma(g^i | t^i).
\end{aligned}
\]
Theorem 3.3.1. $\sigma$ is a Bayesian equilibrium of $G_B$ if and only if it is a Nash equilibrium of $\hat{G}$.
Proof: $\Rightarrow$ Assume that $\sigma$ is a Bayesian equilibrium of $G_B$. Fix $i$ in $N$, and $\tau^i$ in $\Sigma^i$.
\[
g^i(\tau^i, \sigma^{-i}) = \sum_{t^i \in T^i} p(t^i) \, \mathbb{E}_{\tau^i, \sigma^{-i}}(g^i | t^i) \leq \sum_{t^i \in T^i} p(t^i) \, \mathbb{E}_\sigma(g^i | t^i) = g^i(\sigma).
\]
Hence $\sigma$ is a Nash equilibrium of $\hat{G}$.
$\Leftarrow$ Assume that $\sigma$ is not a Bayesian equilibrium of $G_B$. Then there exist $i$ in $N$, $\bar{t}^i$ in $T^i$ and $\tau^i$ in $\Sigma^i$ such that $\mathbb{E}_\sigma(g^i | \bar{t}^i) < \mathbb{E}_{\tau^i, \sigma^{-i}}(g^i | \bar{t}^i)$. Define now $\tilde{\tau}^i : T^i \to \Delta(A^i)$ such that $\tilde{\tau}^i(\bar{t}^i) = \tau^i(\bar{t}^i)$, and for each $t^i \neq \bar{t}^i$, $\tilde{\tau}^i(t^i) = \sigma^i(t^i)$. $\tilde{\tau}^i$ is a strategy in $\Sigma^i$, and we have:
\[
g^i(\tilde{\tau}^i, \sigma^{-i}) = p(\bar{t}^i) \, \mathbb{E}_{\tau^i, \sigma^{-i}}(g^i | \bar{t}^i) + \sum_{t^i \neq \bar{t}^i} p(t^i) \, \mathbb{E}_\sigma(g^i | t^i) > g^i(\sigma)
\]
(using the fact that $p(\bar{t}^i) > 0$). So $\sigma$ is not a Nash equilibrium of $\hat{G}$.
Regarding existence, we know by Nash's theorem that there exists a Nash equilibrium of $\hat{G}$ in mixed strategies. Since $\hat{G}$ has perfect recall, there exists a Nash equilibrium of $\hat{G}$ in behavior strategies. Consequently there exists a Bayesian equilibrium of our finite Bayesian game $G_B$.
We have seen how to study a Bayesian game with consistent beliefs via the study of an associated extensive-form game.
Chapter 4
Correlated Equilibrium
The idea of correlated equilibria (Aumann, 1974) is to introduce an exogenous
mediator who can send a private signal to each player before the game.
Compared to Nash equilibria in mixed strategies, the independence of the
lotteries performed by the players will no longer hold.
4.1 Examples
We start with an example: The Crossroad
\[
\begin{array}{c|cc}
 & L & R \\ \hline
T & (-1, -1) & (1, 0) \\
B & (0, 1) & (0, 0)
\end{array}
\]
There are 2 pure Nash equilibrium payoffs: $(1, 0)$ and $(0, 1)$, and a third mixed Nash equilibrium payoff: $(0, 0)$, corresponding to the equilibrium $(\frac{1}{2}T + \frac{1}{2}B, \frac{1}{2}L + \frac{1}{2}R)$. The best symmetric feasible payoff in mixed strategies is $(1/8, 1/8)$.
Let us modify the game by adding an exogenous mediator (nature if you like) who will choose before the game an exogenous signal A or B with equal probabilities, and publicly announces the outcome to the players. After this announcement the game is played as before. This induces a new game whose extensive form is:
[Figure: extensive form of the extended game. Nature (P0) first chooses the signal A or B with probability 1/2 each. Both players observe the signal, then player 1 (P1) chooses T or B and player 2 (P2) chooses L or R without observing the choice of player 1; the payoffs are those of the Crossroad matrix.]
And one can prove that $(1/2, 1/2)$ is a Nash equilibrium payoff of this new game. Traffic lights are nice!
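The numbers quoted for the Crossroad can be checked with a short computation. This is only a sketch, assuming that the payoff matrix is $(-1, -1)$, $(1, 0)$ in the first row and $(0, 1)$, $(0, 0)$ in the second row (the minus signs are inferred from the equilibrium payoffs stated in the text). The grid search with step 1/100 is a convenience of the sketch, not a proof of optimality.

```python
from fractions import Fraction

# Assumed Crossroad payoffs (player 1, player 2); the minus signs are inferred.
G = {
    ("T", "L"): (-1, -1), ("T", "R"): (1, 0),
    ("B", "L"): (0, 1), ("B", "R"): (0, 0),
}

def expected_payoffs(x, y):
    """Payoffs when player 1 plays T with probability x and player 2 plays L with probability y."""
    weights = {
        ("T", "L"): x * y, ("T", "R"): x * (1 - y),
        ("B", "L"): (1 - x) * y, ("B", "R"): (1 - x) * (1 - y),
    }
    return tuple(sum(w * G[a][i] for a, w in weights.items()) for i in (0, 1))

half = Fraction(1, 2)
mixed_equilibrium_payoff = expected_payoffs(half, half)

# Best symmetric payoff when both players use the same probability (grid of step 1/100).
grid = [Fraction(k, 100) for k in range(101)]
best_x = max(grid, key=lambda x: expected_payoffs(x, x)[0])
best_symmetric_payoff = expected_payoffs(best_x, best_x)

# Public signal: play the pure equilibrium (T, R) after A and (B, L) after B.
after_A, after_B = G[("T", "R")], G[("B", "L")]
public_signal_payoff = tuple(half * u + half * v for u, v in zip(after_A, after_B))
```

After each public signal the players play a pure Nash equilibrium of the original game, so no player can gain by deviating; this is why the payoff $(1/2, 1/2)$ is an equilibrium payoff of the extended game.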
We now consider another example, the chicken game.
\[
\begin{array}{c|cc}
 & L & R \\ \hline
T & (6, 6) & (2, 7) \\
B & (7, 2) & (0, 0)
\end{array}
\]
There are 3 Nash equilibrium payoffs (pure or mixed): $(7, 2)$, $(2, 7)$, and $(14/3, 14/3)$.
Let us modify the game by adding an exogenous mediator who will choose before the game an entry of the matrix according to the following probabilities:
\[
\begin{array}{c|cc}
 & L & R \\ \hline
T & 1/3 & 1/3 \\
B & 1/3 & 0
\end{array}
\]
The mediator will choose $(T, L)$, $(T, R)$ or $(B, L)$ with equal probabilities but will not select $(B, R)$. The selected entry is not publicly announced to the players, but the row of the selected entry is privately told to player 1, and its column is privately told to player 2. Then the game is played as before.
This defines a new game, and one can show that $(5, 5)$ is a Nash equilibrium payoff of this new game. $(5, 5)$ is called a correlated equilibrium payoff of the chicken game.
In general, a correlated equilibrium of a game $G$ is a Nash equilibrium of an extended game where before $G$ is played, an exogenous mediator sends a private signal to each player.
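The payoffs quoted for the chicken game can be recomputed as follows. This is a minimal sketch using exact fractions; the helper name `row_payoff` is introduced here for illustration only.

```python
from fractions import Fraction

# Chicken game payoffs (player 1, player 2).
G = {
    ("T", "L"): (6, 6), ("T", "R"): (2, 7),
    ("B", "L"): (7, 2), ("B", "R"): (0, 0),
}

def row_payoff(row, q):
    """Expected payoff of player 1 playing `row` when player 2 plays L with probability q."""
    return q * G[(row, "L")][0] + (1 - q) * G[(row, "R")][0]

# In the mixed equilibrium player 2 makes player 1 indifferent: 4q + 2 = 7q, so q = 2/3.
q = Fraction(2, 3)
mixed_payoff = row_payoff("T", q)

# Payoff under the mediator's distribution: 1/3 on each of (T, L), (T, R), (B, L).
mediator = {("T", "L"): Fraction(1, 3), ("T", "R"): Fraction(1, 3), ("B", "L"): Fraction(1, 3)}
correlated_payoff = tuple(sum(p * G[a][i] for a, p in mediator.items()) for i in (0, 1))

# Incentives of player 1 when told T: the column is L or R with probability 1/2 each.
obey_T = row_payoff("T", Fraction(1, 2))      # follow the recommendation
disobey_T = row_payoff("B", Fraction(1, 2))   # play B instead
```

When told T, player 1 expects 4 from obeying against 7/2 from playing B; when told B he knows the column is L and gets 7 instead of 6. By symmetry the same holds for player 2, which is why following the private recommendations is an equilibrium.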
4.2 Denitions
Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a finite strategic game, with $N = \{1, ..., n\}$. We extend the payoff functions not only to mixed strategies, but also to correlated distributions over action profiles. To be precise, given a probability $Q = (Q(a))_{a \in A}$ in $\Delta(A)$ and a player $i$, we write $g^i(Q) = \mathbb{E}_Q(g^i) = \sum_{a \in A} Q(a) g^i(a)$.
Definition 4.2.1.
A correlation mechanism is an element $c = ((M^i, \mathcal{M}^i)_{i \in N}, p)$, where:
- for each player $i$ in $N$, $(M^i, \mathcal{M}^i)$ is a measurable space called the message space of player $i$,
- $p$ is a probability measure on joint messages, i.e. on $\left( \prod_{i \in N} M^i, \bigotimes_{i \in N} \mathcal{M}^i \right)$.
Given a correlation mechanism $c$, one defines the extended game $G_c$ as the extensive-form game where:
- Initially, an exogenous mediator chooses a profile of messages $m = (m^i)_{i \in N}$ according to $p$. Then the mediator privately announces $m^i$ to each player $i$.
- Then, the players play $G$, i.e. they choose actions and receive payoffs as in $G$.
It should be clear that the mediator has no payoff, and cannot force the players to do anything. A strategy of player $i$ in $G_c$ is a measurable mapping $\sigma^i : M^i \to A^i$, hence for each $a^i$ in $A^i$ we have that $\{m^i \in M^i, \sigma^i(m^i) = a^i\}$ belongs to $\mathcal{M}^i$. The payoff for player $i$ in the extended game $G_c$ induced by a strategy profile $\sigma = (\sigma^1, ..., \sigma^n)$ is the following expectation, where $M = \prod_{i \in N} M^i$:
\[
g^i_c(\sigma) = \mathbb{E}_p \, g^i(\sigma^1(m^1), ..., \sigma^n(m^n)) = \int_{m = (m^1, ..., m^n) \in M} g^i(\sigma^1(m^1), ..., \sigma^n(m^n)) \, dp(m).
\]
For each action profile $a = (a^1, ..., a^n)$, we define
\[
Q_{c,\sigma}(a) = p\left( \prod_{i \in N} (\sigma^i)^{-1}(a^i) \right) = \int_{m = (m^1, ..., m^n) \in M} \mathbf{1}_{\sigma^1(m^1) = a^1} \cdots \mathbf{1}_{\sigma^n(m^n) = a^n} \, dp(m).
\]
Notice that $Q_{c,\sigma}(a) \geq 0$ and $\sum_{a \in A} Q_{c,\sigma}(a) = 1$, that is $Q_{c,\sigma}$ is in the set $\Delta(A)$ of probability distributions over action profiles in $G$. The payoffs can then be written in the following convenient form:
\[
g^i_c(\sigma) = \sum_{a \in A} Q_{c,\sigma}(a) g^i(a) = g^i(Q_{c,\sigma}).
\]
Definition 4.2.2. A correlated equilibrium of $G$ is a couple $(c, \sigma)$ where $c$ is a correlation mechanism and $\sigma$ is a Nash equilibrium of the extended game $G_c$.
The distribution $Q_{c,\sigma}$ in $\Delta(A)$ is then called a correlated equilibrium distribution. And the payoff profile $(g^1(Q_{c,\sigma}), ..., g^n(Q_{c,\sigma}))$ in $\mathbb{R}^n$ is called a correlated equilibrium payoff.
Notation 4.2.3. We will denote by CED the set of correlated equilibrium distributions of $G$; it is a subset of $\Delta(A)$. If $x = (x^1, ..., x^n) \in \Delta(A^1) \times ... \times \Delta(A^n)$ is a mixed Nash equilibrium of $G$, we say that the induced probability on action profiles $x^1 \otimes x^2 \otimes ... \otimes x^n$ is a Nash equilibrium distribution of $G$, and we denote by NED the set of such distributions. Finally, NEP and CEP will be the subsets of $\mathbb{R}^n$ respectively denoting the set of Nash equilibrium payoffs and the set of correlated equilibrium payoffs.
One can easily show the following properties: $NED \subset CED$, and Nash equilibrium distributions correspond to independent lotteries by the players, i.e. to product probabilities in CED. Moreover CED is convex, and consequently, the convex hull of Nash equilibrium distributions is included in CED. Hence we have: $\emptyset \neq \mathrm{conv}(NED) \subset CED$, and similarly for payoffs: $\emptyset \neq \mathrm{conv}(NEP) \subset CEP$.
Remark: In the definition of correlated equilibrium, the words "Nash equilibrium" stand for pure Nash equilibrium (of $G_c$), but allowing for mixed equilibria would not change the sets of correlated equilibrium distributions and payoffs.
4.3 Canonical correlated equilibrium
A canonical correlated equilibrium is a particular correlated equilibrium where: the message sets of the players coincide with their action sets, and the players exactly play the message/action sent by the mediator. The interpretation of the message $m^i$ sent by the mediator to player $i$ is a recommendation to player $i$, and in a canonical correlated equilibrium each player $i$ exactly plays his recommendation, that is plays $\sigma^i : M^i \to A^i$ such that $\sigma^i(m^i) = m^i$ for each $m^i$ (i.e. $\sigma^i$ is the identity mapping on $A^i$, which is called here the faithful strategy of player $i$). For example, the correlated equilibrium previously mentioned in the chicken game is a canonical correlated equilibrium. Formally, we have the following definition:
Definition 4.3.1. A canonical correlated equilibrium of $G$ is a couple $(c, \sigma)$ where $c = ((A^i)_{i \in N}, p)$ is a correlation mechanism, $\sigma = (\sigma^1, ..., \sigma^n)$ is a Nash equilibrium of $G_c$ and for each $i$, $\sigma^i$ is the faithful strategy of player $i$.
We denote by CCED the set of canonical correlated equilibrium distributions of $G$, and we now characterize this set.
Lemma 4.3.2. An element $P$ in $\Delta(A)$ is a canonical correlated equilibrium distribution of $G$ if and only if:
\[
\forall i \in N, \ \forall a^i \in A^i, \ \forall b^i \in A^i, \quad \sum_{c^{-i} \in A^{-i}} P(a^i, c^{-i}) g^i(a^i, c^{-i}) \geq \sum_{c^{-i} \in A^{-i}} P(a^i, c^{-i}) g^i(b^i, c^{-i}).
\]
Proof: Let $c = ((A^i)_{i \in N}, P)$ be a canonical correlation mechanism. $P$ belongs to $\Delta(A^1 \times ... \times A^n)$. By the characterization of Nash equilibria of extensive-form games, see corollary 2.6.3 in chapter 2, the faithful strategy profile is a Nash equilibrium of the extended game $G_c$ if and only if: for each player $i$, for each action $a^i$ in $A^i$ such that $P(a^i) > 0$, player $i$ would rather play $a^i$ than anything else when he receives the message $a^i$. This is equivalent to: $\forall i \in N$, $\forall a^i \in A^i$ s.t. $P(a^i) > 0$, $\forall b^i \in A^i$,
\[
\sum_{c^{-i} \in A^{-i}} P(c^{-i} | a^i) g^i(a^i, c^{-i}) \geq \sum_{c^{-i} \in A^{-i}} P(c^{-i} | a^i) g^i(b^i, c^{-i}).
\]
Multiplying by $P(a^i) > 0$ gives the inequalities of the lemma, which hold trivially when $P(a^i) = 0$.
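The inequalities of the lemma are finitely many linear constraints, so they can be tested directly. The sketch below is one possible encoding (the function name `is_canonical_ced` and the dictionary representation are assumptions of this sketch); it is applied to the chicken game of section 4.1.

```python
from fractions import Fraction

def is_canonical_ced(P, g, actions):
    """Test the inequalities of the lemma for a distribution P over action profiles.

    P maps action profiles (tuples) to probabilities, g maps every action profile
    to its payoff vector, and actions[i] lists the actions of player i.
    """
    for i in range(len(actions)):
        for a_i in actions[i]:
            for b_i in actions[i]:
                # Gain of player i from playing b_i whenever a_i is recommended.
                gain = 0
                for profile, prob in P.items():
                    if profile[i] != a_i:
                        continue
                    deviation = profile[:i] + (b_i,) + profile[i + 1:]
                    gain += prob * (g[deviation][i] - g[profile][i])
                if gain > 0:
                    return False
    return True

actions = [["T", "B"], ["L", "R"]]
chicken = {
    ("T", "L"): (6, 6), ("T", "R"): (2, 7),
    ("B", "L"): (7, 2), ("B", "R"): (0, 0),
}
third = Fraction(1, 3)
mediator = {("T", "L"): third, ("T", "R"): third, ("B", "L"): third}
point_mass = {("T", "L"): Fraction(1)}
# Product distribution of the mixed Nash equilibrium (2/3, 1/3) x (2/3, 1/3).
mixed_nash = {
    ("T", "L"): Fraction(4, 9), ("T", "R"): Fraction(2, 9),
    ("B", "L"): Fraction(2, 9), ("B", "R"): Fraction(1, 9),
}
```

On these inputs the mediator's distribution and the product distribution of the mixed Nash equilibrium pass the test, while the point mass on $(T, L)$ fails: a player told T (or L) would gain by deviating.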
The following result is a form of the revelation principle.
Proposition 4.3.3. Every correlated equilibrium distribution can be obtained
in a canonical way: CED = CCED.
Proof: We have to prove that $CED \subset CCED$. Let $(c, \sigma)$ be a correlated equilibrium, we have $c = ((M^i, \mathcal{M}^i)_{i \in N}, p)$, for each $i$, $\sigma^i$ is measurable from $M^i$ to $A^i$ and $\sigma = (\sigma^1, ..., \sigma^n)$ is a Nash equilibrium of the extended game $G_c$. Assume for the sake of contradiction that the induced distribution $Q_{c,\sigma}$, simply denoted by $Q$ here, is not in CCED. There exist $i$ in $N$, $a^i$ and $b^i$ in $A^i$ such that:
\[
\sum_{c^{-i} \in A^{-i}} Q(a^i, c^{-i}) g^i(a^i, c^{-i}) < \sum_{c^{-i} \in A^{-i}} Q(a^i, c^{-i}) g^i(b^i, c^{-i}).
\]
We construct a profitable deviation for player $i$ as follows. The idea is to play $b^i$ instead of $a^i$ whenever possible. More precisely, define $\tau^i : M^i \to A^i$ such that for each $m^i$:
\[
\tau^i(m^i) = \begin{cases} \sigma^i(m^i) & \text{if } \sigma^i(m^i) \neq a^i, \\ b^i & \text{if } \sigma^i(m^i) = a^i. \end{cases}
\]
One can check that $\tau^i$ is measurable because $\sigma^i$ was. We have $g^i_c(\sigma) = \sum_{a' \in A} Q(a') g^i(a')$, and $g^i_c(\tau^i, \sigma^{-i}) = \sum_{a' \in A} Q'(a') g^i(a')$, where $Q' = Q_{c,(\tau^i, \sigma^{-i})}$. The distribution $Q'$ is the element of $\Delta(A)$ such that for each $a' = (a'^1, ..., a'^n)$ in $A$:
\[
Q'(a') = \begin{cases} 0 & \text{if } a'^i = a^i, \\ Q(a^i, a'^{-i}) + Q(b^i, a'^{-i}) & \text{if } a'^i = b^i, \\ Q(a') & \text{if } a'^i \neq a^i, b^i. \end{cases}
\]
We now have:
\[
\begin{aligned}
g^i_c(\sigma) - g^i_c(\tau^i, \sigma^{-i}) &= \sum_{a', \, a'^i = a^i} Q(a') g^i(a') - \sum_{a', \, a'^i = b^i} Q(a^i, a'^{-i}) g^i(b^i, a'^{-i}) \\
&= \sum_{a'^{-i} \in A^{-i}} Q(a^i, a'^{-i}) \left( g^i(a^i, a'^{-i}) - g^i(b^i, a'^{-i}) \right) < 0.
\end{aligned}
\]
Hence $\tau^i$ is a profitable deviation for player $i$, contradicting $\sigma$ being a Nash equilibrium of $G_c$.
Corollary 4.3.4.
\[
CED = \left\{ p \in \Delta(A), \ \forall i \in N, \ \forall a^i \in A^i, \ \forall b^i \in A^i, \ \sum_{c^{-i} \in A^{-i}} p(a^i, c^{-i}) g^i(a^i, c^{-i}) \geq \sum_{c^{-i} \in A^{-i}} p(a^i, c^{-i}) g^i(b^i, c^{-i}) \right\}.
\]
Notice that CED is bounded and defined by a finite number of linear inequalities, hence has a simple geometrical structure: it is a polytope, i.e. the convex hull of a finite number of points. This holds as well for the set CEP of correlated equilibrium payoffs.
4.4 Extensive-form correlated equilibrium and communication equilibrium
We now consider an extensive-form game with several stages. The previous notion of correlated equilibrium has been obtained with the introduction of a mediator sending a single message to each player before the game starts. If one allows the mediator to send a private signal to each player at every information set of this player, we obtain a weaker notion called extensive-form correlated equilibrium.
[Figure: extensive-form game. Player 1 (P1) first chooses L, which ends the game with payoff (2, 2), or R. After R, player 1 chooses l or r and player 2 (P2), whose information set is {x_1, x_2}, chooses a or b; the terminal payoffs are (5, 1), (0, 0), (0, 0) and (1, 5).]
Looking at the associated normal form, one can study the correlated equilibrium payoffs for this game. It is clear that $(R, r)$ is strictly dominated, hence it will never be played in a correlated equilibrium. Consequently, for any correlated equilibrium payoff $x = (x^1, x^2)$ of this game, we have $x^2 \leq 2$.
Suppose now that we add an exogenous mediator who:
- keeps silent at the beginning of the game,
- if player 1 has chosen $R$, and just after he has done so, selects Heads or Tails with equal probability and publicly announces the outcome to both players.
We obtain a new extended game and one can check that $(3, 3)$ is an equilibrium payoff of this game. Hence $(3, 3)$ is an extensive-form correlated equilibrium payoff, but not a correlated equilibrium payoff, of this game.
Similarly, one can go even farther by allowing the mediator to communi-
cate with the players, i.e. not only to send but also to receive information
from them. Suppose that in a multistage game, at every stage each player
rst sends a private message to the mediator, and then the mediator pri-
vately sends back a message to each player. We obtain another notion, even
weaker than the previous ones, called communication equilibrium (Myerson
1986, Forges 1986).
[Figure: extensive-form game in which nature (P0) first moves with probabilities (1/2, 1/2), player 1 (P1) then chooses a or b, and player 2 (P2) finally chooses L or R.]
Here $(0, 1)$ is a communication equilibrium payoff, but not an extensive-form correlated equilibrium payoff.
Summing up, we have the following inclusion chain for equilibrium payoffs (EXCEP is the set of extensive-form correlated equilibrium payoffs, COMEP is the set of communication equilibrium payoffs):
\[
NEP \subset \mathrm{conv}(NEP) \subset CEP \subset EXCEP \subset COMEP.
\]
Chapter 5
Introduction to repeated games
Repeated games represent dynamic interactions in discrete time. In the most general setup, these interactions are modeled with a Markovian state variable which is jointly controlled by the players. The game is played by stages, and each player first receives a private signal on the initial state. Then at every stage, the players simultaneously choose an action in their action set. The selected actions together with the state determine: 1) the current payoffs, and 2) a probability distribution over the next state and the signals received by the players. This is a very general model, and when a player selects an action at some stage several strategic aspects are present: 1) he may influence his own payoff, 2) he may influence the state process, 3) he may reveal or learn something about the current state, and 4) he may influence the knowledge of all players on the current action profile played at that stage.
We mainly consider here the simplest case of repeated games: non-stochastic repeated games with complete information and perfect observation, or standard repeated games for short. The same stage game is repeated over and over, all players know the stage game and after each stage the actions played are publicly observed.
5.1 Examples
Given a finite strategic game $G$ called the stage game, we define $G_T$ as the game $G$ repeated $T$ times, the payoff of the players being defined by the average of their stage payoffs. $G_T$ is a finite extensive-form game, and we denote by $E_T$ its set of mixed Nash equilibrium payoffs.
Example 5.1.1. The stage game is
\[
\begin{array}{c|cc}
 & L & R \\ \hline
T & (1, 0) & (0, 0) \\
B & (0, 0) & (0, 1)
\end{array}
\]
$(1, 0)$ and $(0, 1)$ are Nash equilibrium payoffs of the stage game. It is easy to construct a Nash equilibrium of the 2-stage game with payoff $(1/2, 1/2)$: the two players play $(T, L)$ at stage 1, and $(B, R)$ at stage 2.
Consequently $(1/2, 1/2) \in E_2$. Repetition allows for the convexification of the equilibrium payoffs.
Example 5.1.2. The stage game is
\[
\begin{array}{c|ccc}
 & C^2 & D^2 & E^2 \\ \hline
C^1 & (3, 3) & (0, 4) & (-10, -10) \\
D^1 & (4, 0) & (1, 1) & (-10, -10) \\
E^1 & (-10, -10) & (-10, -10) & (-10, -10)
\end{array}
\]
The set of Nash equilibrium payoffs of the stage game is $E_1 = \{(1, 1), (-10, -10)\}$. One can construct a Nash equilibrium of the 2-stage game with payoff $(2, 2)$ as follows.
In the first stage, player 1 plays $C^1$ and player 2 plays $C^2$. At the second stage, player 1 plays $D^1$ if player 2 has played $C^2$ at stage 1, and he plays $E^1$ (this can be interpreted as a punishment) otherwise. Similarly, player 2 plays $D^2$ if player 1 has played $C^1$ at stage 1, and he plays $E^2$ (punishment) otherwise. We have defined a Nash equilibrium of the 2-stage game with payoff $(2, 2)$.
Similarly, one can show that for each $T \geq 1$, we have $\frac{T-1}{T}(3, 3) + \frac{1}{T}(1, 1) \in E_T$. Using punishments, repetition may allow for cooperation.
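The incentive computation behind this example can be sketched numerically. The code assumes that the punishment actions yield $-10$ to both players (the sign is inferred from their role as punishments), and it assumes the natural $T$-stage extension of the 2-stage construction: play $(C^1, C^2)$ at stages $1, ..., T-1$ and $(D^1, D^2)$ at stage $T$, switching to $E$ forever after any deviation. The function names are introduced for this sketch only.

```python
from fractions import Fraction

def on_path_average(T):
    """Average payoff of each player when nobody deviates: 3 for T-1 stages, then 1."""
    return Fraction(3 * (T - 1) + 1, T)

def best_deviation_average(T):
    """Best average payoff from a first deviation at some stage s <= T-1.

    The deviator gets 3 before stage s, at most 4 at stage s (playing D against C),
    and -10 at each of the T - s remaining stages because the opponent plays E.
    Returns None when T = 1, where no such deviation exists.
    """
    best = None
    for s in range(1, T):
        average = Fraction(3 * (s - 1) + 4 - 10 * (T - s), T)
        best = average if best is None else max(best, average)
    return best
```

At the last stage the players play the stage equilibrium $(D^1, D^2)$, so no deviation is profitable there. For $T = 2$ the on-path payoff is 2 while the best earlier deviation yields $-3$, and more generally the loss from deviating at stage $s$ is $13(T - s) - 3 > 0$ in total payoff.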
Example 5.1.3. The stage game is the following prisoner's dilemma.
\[
\begin{array}{c|cc}
 & C^2 & D^2 \\ \hline
C^1 & (3, 3) & (0, 4) \\
D^1 & (4, 0) & (1, 1)
\end{array}
\]
We show by induction that $E_T = \{(1, 1)\}$ for each $T$.
Proof: The result is clear for $T = 1$. Assume it is true for a fixed $T \geq 1$, and consider a Nash equilibrium $\sigma = (\sigma^1, \sigma^2)$ of the $(T + 1)$-stage repeated game. Denote by $x$, respectively $y$, the probability that at the first stage player 1 plays $C^1$, respectively that player 2 plays $C^2$. Using corollary 2.6.3, we obtain that after every action profile played with positive probability at stage 1, the continuation strategies induced by $\sigma$ form a Nash equilibrium of the remaining game. Hence by assumption the average payoff associated to these continuation strategies is $(1, 1)$. Hence the equilibrium payoff for player 1 in the $(T + 1)$-stage game is:
\[
\frac{1}{T+1} \left( \left( 3xy + 4(1 - x)y + (1 - x)(1 - y) \right) + T \right).
\]
By playing $D^1$ at all stages, player 1 can make sure his expected payoff will be at least:
\[
\frac{1}{T+1} \left( \left( 4y + (1 - y) \right) + T \right).
\]
By the equilibrium property, we have $3xy + 4(1 - x)y + (1 - x)(1 - y) \geq 4y + (1 - y)$, and we obtain $x = 0$. Similarly $y = 0$, and the equilibrium payoff in the $(T + 1)$-stage game is $(1, 1)$.
Hence there is no possible cooperation in the standard version of the prisoner's dilemma repeated a finite number of times.
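The step from the equilibrium inequality to $x = 0$ in the induction above can be written out as follows; this is just the algebra, with $x$ and $y$ as defined in the proof.

```latex
\begin{aligned}
&\bigl(3xy + 4(1-x)y + (1-x)(1-y)\bigr) - \bigl(4y + (1-y)\bigr) \\
&\qquad = 3xy - 4xy - x(1-y) \\
&\qquad = -xy - x(1-y) \\
&\qquad = -x .
\end{aligned}
```

The equilibrium property requires this difference to be non-negative, and $x \in [0, 1]$, so $x = 0$.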
5.2 Model of standard repeated games
We fix a finite strategic game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$, called the stage game. We will study the repetition of $G$ a great number of times, and the infinite repetition of $G$. At every stage the players simultaneously choose an action in their own set of actions, then the action profile is publicly observed and the next stage is played. As usual we put $A = \prod_{i \in N} A^i$ and $g = (g^i)_{i \in N}$.
5.2.1 Histories and plays.
A history of length $t$ is defined as a vector $(a_1, ..., a_t)$ of elements of $A$, with $a_1$ representing the action profile played at stage 1, $a_2$ representing the action profile played at stage 2, etc. The set of histories of length $t$ is nothing but the cartesian product $A^t = A \times ... \times A$ ($t$ times), denoted by $H_t$. (For $t = 0$ there is a unique history of length 0, and by convention $H_0$ is the singleton $\{\emptyset\}$.)
\[
H_T = \{(a_1, ..., a_T), \ \forall t \ a_t \in A\}.
\]
The set of all histories is $H = \bigcup_{t \geq 0} H_t$. A play of the repeated game is defined as an infinite sequence $(a_1, ..., a_t, ...)$ of elements of $A$; the set of plays is denoted by $H_\infty$ (identical to the cartesian product $A^\infty$).
5.2.2 Strategies
We define a single notion of strategies. This notion corresponds to behavior strategies, and is adapted to any number of repetitions of $G$.
Definition 5.2.1. A strategy of player $i$ is a mapping $\sigma^i$ from $H$ to $\Delta(A^i)$. We denote by $\Sigma^i$ the set of strategies of player $i$ and by $\Sigma = \prod_{i \in N} \Sigma^i$ the set of strategy profiles.
The interpretation of a strategy is the following: for each $h$ in $H_t$, $\sigma^i(h)$ is the probability used by player $i$ to select his action of stage $t + 1$ if history $h$ has been played at the first $t$ stages.
At every stage, given the past history the lotteries used by the players are independent, and a strategy profile $\sigma$ naturally induces by induction a probability distribution on the set $H$ (which is countable): first use the stage-1 lotteries $(\sigma^i(\emptyset))_{i \in N}$ to define the probability induced by $\sigma$ on the actions of stage 1, then use the transitions given by the stage-2 lotteries $(\sigma^i(a_1))_{i \in N}$ to define the probability induced by $\sigma$ on the actions of stages 1 and 2, etc. Using a probability result (e.g. the Kolmogorov extension theorem), this probability can be extended in a unique way to the set of plays $H_\infty$ (endowed with the product $\sigma$-algebra on $A^\infty$).
5.2.3 Payoffs
In a repeated game the players receive a payoff at every stage; how should they evaluate their stream of payoffs? There are several possibilities, and we will consider: finitely repeated games with average payoffs, infinitely repeated discounted games, and uniform games (which are infinitely repeated and undiscounted). In the sequel $a_t$ denotes the random variable of the action profile played at stage $t$.
Definition 5.2.2. The finitely repeated game $G_T$.
The average payoff of player $i$ up to stage $T$ if the strategy profile $\sigma$ is played is:
\[
\gamma^i_T(\sigma) = \mathbb{E}_\sigma \left( \frac{1}{T} \sum_{t=1}^{T} g^i(a_t) \right).
\]
For $T \geq 1$, the $T$-stage repeated game is the game $G_T = (N, (\Sigma^i)_{i \in N}, (\gamma^i_T)_{i \in N})$.
In the game $G_T$ it is harmless to restrict the players to use strategies for the first $T$ stages only, hence $G_T$ can be seen as a finite game and by Nash's theorem its set of Nash equilibrium payoffs $E_T$ is non-empty and compact. Notice that considering the sum of the payoffs instead of the average would not change the Nash equilibria of $G_T$.
Definition 5.2.3. The discounted game $G_\lambda$.
Given $\lambda$ in $(0, 1]$, the repeated game with discount factor $\lambda$ is $G_\lambda = (N, (\Sigma^i)_{i \in N}, (\gamma^i_\lambda)_{i \in N})$, where for each strategy profile $\sigma$:
\[
\gamma^i_\lambda(\sigma) = \mathbb{E}_\sigma \left( \lambda \sum_{t=1}^{\infty} (1 - \lambda)^{t-1} g^i(a_t) \right).
\]
With this definition, receiving a payoff of $1 - \lambda$ today is equivalent to receiving a payoff of 1 tomorrow. It is standard to introduce the notation $\delta = 1 - \lambda \in [0, 1)$ (be careful: the expression "discount factor" sometimes refers to $\delta$ rather than $\lambda$). One can apply Glicksberg's theorem 1.3.13 and obtain that the set $E_\lambda$ of Nash equilibrium payoffs of $G_\lambda$ is non-empty and compact. Notice that both choosing $T = 1$ in definition 5.2.2 or $\lambda = 1$ in definition 5.2.3 leads to the consideration of the stage game $G$.
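To make the two evaluations concrete, here is a small sketch that evaluates a deterministic stream of stage payoffs, assuming the normalized discounted sum $\lambda \sum_{t \geq 1} (1 - \lambda)^{t-1} g_t$. Streams are finite lists in this sketch, which amounts to treating all later payoffs as 0; the function names are illustrative.

```python
from fractions import Fraction

def average_payoff(stream):
    """Average of the stage payoffs, as in the finitely repeated game."""
    return sum(stream, Fraction(0)) / len(stream)

def discounted_payoff(stream, lam):
    """lam * sum over t >= 1 of (1 - lam)^(t-1) * g_t, with later payoffs taken as 0."""
    return sum(lam * (1 - lam) ** (t - 1) * g for t, g in enumerate(stream, start=1))

lam = Fraction(1, 4)
today = discounted_payoff([1 - lam], lam)            # payoff 1 - lam at stage 1
tomorrow = discounted_payoff([0, Fraction(1)], lam)  # payoff 1 at stage 2
```

Here `today` and `tomorrow` coincide, which is the equivalence stated above. With `lam = 1` only the first stage counts, and with a single stage the average payoff is the stage payoff, so both evaluations reduce to the stage game.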
The uniform approach directly considers long-term strategic aspects.
Definition 5.2.4. The uniform game $G_\infty$.
A strategy profile $\sigma$ is a uniform equilibrium of $G_\infty$ if:
1) $\forall \varepsilon > 0$, $\sigma$ is an $\varepsilon$-Nash equilibrium of any finitely repeated game long enough, i.e.: $\exists T_0$, $\forall T \geq T_0$, $\forall i \in N$, $\forall \tau^i \in \Sigma^i$, $\gamma^i_T(\tau^i, \sigma^{-i}) \leq \gamma^i_T(\sigma) + \varepsilon$, and
2) $((\gamma^i_T(\sigma))_{i \in N})_T$ has a limit $\gamma(\sigma)$ in $\mathbb{R}^N$ when $T$ goes to infinity.
$\gamma(\sigma)$ is then called a uniform equilibrium payoff of the repeated game, and the set of uniform equilibrium payoffs is denoted by $E_\infty$.
Proposition 5.2.5. For each $T \geq 1$ and $\lambda \in (0, 1]$, we have $E_1 \subset E_T \subset E_\infty$ and $E_1 \subset E_\lambda$.
The proof that $E_1 \subset E_T \subset E_\infty$ is simple and can be made by concatenation of equilibrium strategies. The proof that $E_1 \subset E_\lambda$ is more subtle and can be deduced from the Folk theorem 5.4.1.
5.3 Feasible and individually rational payoffs
The set of vector payoffs achievable with correlated strategies in the stage game is $g(\Delta(A)) = \{g(P), P \in \Delta(A)\}$. Notice that it is nothing but the convex hull of the vector payoffs achievable with pure strategies.
Definition 5.3.1. The set of feasible payoffs is $\mathrm{conv}\, g(A) = g(\Delta(A))$.
The set of feasible payoffs is a bounded polytope, and it represents the set of payoffs that can be obtained in any version of the repeated game. It contains $E_T$, $E_\lambda$ and $E_\infty$.
Definition 5.3.2. For each player $i$ in $N$, the punishment level of player $i$ is defined as:
\[
v^i = \min_{x^{-i} \in \prod_{j \neq i} \Delta(A^j)} \ \max_{x^i \in \Delta(A^i)} g^i(x^i, x^{-i}).
\]
$v^i$ is called the independent minmax of player $i$. Be careful that in general one cannot exchange $\min_{x^{-i} \in \prod_{j \neq i} \Delta(A^j)}$ and $\max_{x^i \in \Delta(A^i)}$ in the above expression, see the last game of exercise 6.1.17.
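In the simplest case the punishment level can be computed exactly. The sketch below assumes two players and a punishing opponent with exactly two actions, so that the opponent's mixed action is a single probability $q$; the inner maximum is then a convex piecewise-linear function of $q$, whose minimum is attained at $q = 0$, $q = 1$ or at a crossing point of two lines. The function name `punishment_level` and the input encoding are assumptions of this sketch, and payoffs are taken to be integers.

```python
from fractions import Fraction
from itertools import combinations

def punishment_level(rows):
    """Independent minmax of a player facing one opponent with two actions.

    rows[k] = (u0, u1) gives the integer payoffs of the punished player's k-th
    action against the opponent's first and second action respectively.
    """
    def best_reply_value(q):
        # q is the probability of the opponent's first action.
        return max(q * u0 + (1 - q) * u1 for u0, u1 in rows)

    candidates = [Fraction(0), Fraction(1)]
    for (u0, u1), (w0, w1) in combinations(rows, 2):
        slope_difference = (u0 - u1) - (w0 - w1)
        if slope_difference != 0:
            q = Fraction(w1 - u1, slope_difference)  # where the two lines cross
            if 0 <= q <= 1:
                candidates.append(q)
    return min(best_reply_value(q) for q in candidates)

# Prisoner's dilemma, player 1: C gives (3, 0) and D gives (4, 1) against (C, D).
v_prisoner = punishment_level([(3, 0), (4, 1)])
# A matching-pennies-like example, where the punisher must randomize.
v_pennies = punishment_level([(1, -1), (-1, 1)])
```

For the prisoner's dilemma this gives the punishment level 1, obtained when the opponent plays D. With more than two players the minimum is taken over product distributions and this one-dimensional argument no longer applies.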
Definition 5.3.3. The set of individually rational payoffs is:
\[
IR = \{u = (u^i)_{i \in N}, \ u^i \geq v^i \ \forall i \in N\}.
\]
And the set of feasible and individually rational payoffs is denoted by:
\[
E = (\mathrm{conv}\, g(A)) \cap IR.
\]
Using the fact that actions are publicly observed after each stage, given a strategy profile $\sigma^{-i}$ of the players different from $i$, it is easy to construct a strategy $\sigma^i$ of player $i$ such that: $\forall T$, $\gamma^i_T(\sigma^i, \sigma^{-i}) \geq v^i$. As a consequence $E_\infty$, $E_T$ and $E_\lambda$ are always included in $E$.
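We now illustrate the previous definitions on the prisoner's dilemma:
\[
\begin{array}{c|cc}
 & C^2 & D^2 \\ \hline
C^1 & (3, 3) & (0, 4) \\
D^1 & (4, 0) & (1, 1)
\end{array}
\]
We have $v^1 = v^2 = 1$, and the set of feasible and individually rational payoffs is represented in the following picture:
[Figure: the set $E$ of feasible and individually rational payoffs of the prisoner's dilemma, drawn in the plane with the payoff of player 1 (P1) on the horizontal axis and the payoff of player 2 (P2) on the vertical axis; the axes are marked at 0, 1 and 4.]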
5.4 The Folk theorems
These theorems deal with very patient players, and more precisely with: finitely repeated games with a large number of stages, discounted repeated games with a low discount factor $\lambda$, or uniform games. The message is essentially the following: the set of equilibrium payoffs of the repeated game is the set of payoffs which are feasible (i.e. that one can obtain while playing) and individually rational (i.e. such that each player has at least his punishment level). It can be viewed as an "everything is possible" result, because it implies that any reasonable payoff can be achieved at equilibrium.
5.4.1 The Folk theorems for $G_\infty$
Theorem 5.4.1. The Folk theorem
The set of uniform equilibrium payoffs is the set of feasible and individually rational payoffs: $E_\infty = E$.
It is difficult to establish who proved the Folk theorem. It has been generally known in the profession for at least 15 or 20 years, but has not been published; its authorship is obscure. (R.J. Aumann, 1981)
Proof: We have to show that $E \subset E_\infty$. Fix $u \in E$. $u$ is feasible, hence there exists a play $h = (a_1, ..., a_t, ...)$ such that for each player $i$,
\[
\frac{1}{T} \sum_{t=1}^{T} g^i(a_t) \xrightarrow[T \to \infty]{} u^i.
\]
The play $h$ will be called the main path of the strategy, and playing according to $h$ for some player $i$ at stage $t$ means to play the $i$-component of $a_t$. For each couple of distinct players $(i, j)$, fix $x^{i,j}$ in $\Delta(A^j)$ such that $(x^{i,j})_{j \neq i}$ achieves the minimum in the definition of $v^i$. Fix now a player $i$ in $N$; we define his strategy $\sigma^i$.
$\sigma^i$ plays at stage 1 according to the main path, and continues to play according to $h$ as long as the other players do. If at some stage $t \geq 1$, for the first time some player $j$ does not follow the main path, then $\sigma^i$ plays at all the following stages the mixed action $x^{j,i}$ (if for the first time several players leave the main path at the same stage, by convention the smallest player, according to any fixed linear order on $N$ defined in advance, is punished). It is easy to see that $\sigma = (\sigma^i)_{i \in N}$ is a uniform equilibrium of $G_\infty$ with payoff $u$.
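The construction can be illustrated on the prisoner's dilemma of section 5.3 with target payoff $u = (3, 3)$. The sketch assumes the main path is $(C^1, C^2)$ at every stage and the punishment is $D$ forever, which holds the deviator to his punishment level 1; `best_deviation_gain` is a name introduced here.

```python
from fractions import Fraction

def best_deviation_gain(T):
    """Largest gain in T-stage average payoff from leaving the main path.

    A player who first deviates at stage s gets 3 before s, at most 4 at stage s,
    and at most his punishment level 1 at each of the T - s remaining stages.
    """
    on_path = Fraction(3)
    best = max(Fraction(3 * (s - 1) + 4 + (T - s), T) for s in range(1, T + 1))
    return best - on_path
```

The best deviation is at the very last stage and gains exactly $1/T$ in average payoff. So the profile is not a Nash equilibrium of $G_T$, but for every $\varepsilon > 0$ it is an $\varepsilon$-Nash equilibrium of $G_T$ as soon as $T \geq 1/\varepsilon$, which is what the definition of a uniform equilibrium requires.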
Some equilibria constructed as above can be criticized since it might not be profitable for a player to punish forever some other player who first left the main path. As in extensive-form games, we define the notion of subgame-perfect equilibrium. Given a history $h$ in $H$ and a strategy profile $\sigma$, the continuation strategy $\sigma[h]$ is defined as the strategy profile $\tau = (\tau^i)_{i \in N}$, where $\forall i \in N$, $\forall h' \in H$: $\tau^i(h') = \sigma^i(hh')$, with the notation that $hh'$ is the concatenation of history $h$ followed by history $h'$.
Definition 5.4.1. A subgame-perfect uniform equilibrium is a strategy profile $\sigma$ in $\Sigma$ such that for each history $h$ in $H$, $\sigma[h]$ is a uniform equilibrium of $G_\infty$. We denote by $E'_\infty$ the set of payoff profiles of these equilibria.
Considering $h = \emptyset$ implies that a subgame-perfect uniform equilibrium is a uniform equilibrium, hence $E'_\infty \subset E_\infty = E$. In 1976, Aumann and Shapley, as well as Rubinstein, proved slight variants of the following result: introducing subgame perfection changes nothing here.
Theorem 5.4.2. Perfect Folk Theorem
$E'_\infty = E_\infty = E$.
Proof: Here also, given a feasible and individually rational payoff, we have to construct a subgame-perfect equilibrium. In comparison with the proof of theorem 5.4.1, it is sufficient to modify the punishment phase. If at some stage $t$, for the first time player $j$ leaves the main path, the other players $-j$ will punish player $j$, but the punishment will not last forever. It will last until some stage $\bar{t}$, and then whatever has happened during the punishment phase all players forget everything and come back, as in stage 1, on the main path. One possibility is to define $\bar{t}$ so that the expected average payoff of player $j$ up to stage $\bar{t}$ will be lower than $v^j + 1/t$. Another possibility is to simply put $\bar{t} = 2t$.
5.4.2 The discounted Folk theorems
Let us now consider discounted evaluations of payoffs, and come back to the previous prisoner's dilemma with discount factor $\lambda \in (0, 1]$.
We wonder if $(3, 3) \in E_\lambda$. Playing $D^i$ for the first time will increase once the stage payoff of player $i$ by 1 unit, but then at each stage the payoff of this player will be reduced by 2. Hence with the standard Folk theorem strategies, $(3, 3)$ will be in $E_\lambda$ if: $1 \leq 2 \sum_{t=1}^{\infty} (1 - \lambda)^t = 2(1 - \lambda)/\lambda$, hence if the players are sufficiently patient in the sense that $\lambda \leq 2/3$.
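The patience threshold can be double-checked by comparing normalized discounted payoffs directly. This is a sketch under the assumptions used in the text: cooperation yields 3 at every stage, and a deviation yields 4 once and then 1 at every later stage; the function names are illustrative.

```python
from fractions import Fraction

def deviation_payoff(lam):
    """Normalized payoff of deviating at stage 1: weight lam on 4, weight 1 - lam on 1."""
    return lam * 4 + (1 - lam) * 1

def cooperation_is_sustained(lam):
    """True when deviating is not better than the cooperative payoff 3."""
    return deviation_payoff(lam) <= 3

def condition_from_text(lam):
    """The inequality 1 <= 2 (1 - lam) / lam: one-shot gain against the later losses."""
    return 1 <= 2 * (1 - lam) / lam
```

Both tests agree and hold exactly when $\lambda \leq 2/3$, since $4\lambda + (1 - \lambda) \leq 3$ is the same as $3\lambda \leq 2$.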
In general we have $E_\lambda \neq E$ for each $\lambda$, and the convergence of $E_\lambda$ to $E$ when patience becomes extreme ($\lambda$ goes to 0 or $\delta$ goes to 1) is worth studying. This convergence should be understood for the Hausdorff distance between compact sets of $\mathbb{R}^N$, defined by:
\[
d(A, B) = \max \left\{ \sup_{a \in A} \inf_{b \in B} d(a, b), \ \sup_{b \in B} \inf_{a \in A} d(a, b) \right\}.
\]
Theorem 5.4.3. Discounted Folk theorem (Sorin 1986) Assume there are 2 players, or that there exists $u = (u^i)_{i \in N}$ in $E$ such that for each player $i$, we have $u^i > v^i$. Then $E_\lambda \xrightarrow[\lambda \to 0]{} E$.
The following example (due to Forges, Mertens and Neyman) is a counter-example to the convergence of $E_\lambda$ to $E$:
\[
\begin{pmatrix} (1, 0, 0) & (0, 1, 0) \\ (0, 1, 0) & (1, 0, 1) \end{pmatrix}.
\]
Player 1 chooses a row, player 2 chooses a column and player 3 chooses nothing here. Essentially, this game is a zero-sum game between players 1 and 2, and in each Nash equilibrium of $G_\lambda$ each of these players independently randomizes at each stage between his 2 actions with equal probabilities. Therefore $E_\lambda = \{(1/2, 1/2, 1/4)\}$ for each $\lambda$, whereas $(1/2, 1/2, 1/2) \in E$.
As before one can define the subgame-perfect equilibria of $G_\lambda$ as the
strategy profiles inducing a Nash equilibrium in every subgame of $G_\lambda$, and
we denote by $E'_\lambda$ the (compact) set of subgame-perfect equilibrium payoffs.
Theorem 5.4.4. (Perfect discounted Folk theorem, Fudenberg Maskin 1986)
If $E$ has non-empty interior, then $E'_\lambda \to E$ as $\lambda \to 0$.
The proofs of the discounted Folk theorems use strict punishments, and
for the subgame-perfect case there are also reward phases to give incentives
for the players to punish someone who deviated. They are omitted here. An
example where the convergence of $E'_\lambda$ to $E$ fails is the following 2-player
game:
\[ \begin{pmatrix} (1, 0) & (1, 1) \\ (0, 0) & (0, 0) \end{pmatrix}. \]
In each subgame-perfect equilibrium of the discounted game, player 1 chooses
the Top row at every stage after any history, so $E'_\lambda = \{(1, 1)\}$ for each $\lambda$.

Example 5.4.2. Repeated Bertrand competition
Consider an oligopoly with $n$ identical firms, producing the same good
with constant marginal cost $c > 0$. Each of the firms has to fix a selling
price, and the consumers will buy the good from the firm with the lowest price
(in case of several lowest prices, the firms with that price will equally share
the market). We denote by $D(p)$ the number of consumers willing to buy a
unit of good at price $p$ and assume that the demand is always fulfilled. Each
firm wants to maximize its profit, which is $\pi(p) = D(p)(p - c)$ if the firm is
alone with the lowest price $p$, and $\pi(p) = 0$ if the firm sells nothing. Assume
that $\pi$ has a maximum at some price $\hat p > c$.
If the game is played once, there is a unique equilibrium price which is
the marginal cost $c$, and the profits are zero. Let us introduce dynamics
(firms can adapt their price depending on the competition) and consider
the repeated game with discount factor $\lambda$. Define a strategy profile where
everyone plays $\hat p$ as long as everybody does so, and in case there is a deviation
everyone plays $c$ forever. The payoff of a firm if all firms follow this
strategy is $\pi(\hat p)/n$, and the payoff of a firm deviating from this strategy by
playing a price $p$ at some stage will be from that stage on:
$\lambda \pi(p) + (1 - \lambda) 0 = \lambda \pi(p)$. Hence if the players are sufficiently
patient in the sense that $\lambda \le 1/n$, we have a subgame-perfect equilibrium
where the realized price is the collusion (or monopoly) price $\hat p$.
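The comparison between the collusive payoff and the best deviation can be sketched numerically. The linear demand $D(p) = \max(0, 1 - p)$ used below is purely an assumption made for illustration (the example leaves $D$ unspecified), and the function names are hypothetical.

```python
from fractions import Fraction

def profit(p, c):
    """Monopoly profit pi(p) = D(p) * (p - c) for the ASSUMED demand D(p) = max(0, 1 - p)."""
    return max(Fraction(0), 1 - p) * (p - c)

def collusion_is_sustainable(lam, n, c):
    """Check pi(p_hat)/n >= lam * pi(p_hat), i.e. the condition lam <= 1/n."""
    p_hat = (1 + c) / 2                      # maximiser of pi for the assumed demand
    on_path = profit(p_hat, c) / n           # stage payoff of each firm under collusion
    best_deviation = lam * profit(p_hat, c)  # supremum of lam * pi(p) over undercutting prices
    return on_path >= best_deviation

c = Fraction(1, 4)
print(collusion_is_sustainable(Fraction(1, 3), 3, c))  # lam = 1/n
print(collusion_is_sustainable(Fraction(1, 2), 3, c))  # lam > 1/n
```

The supremum of the deviation payoff is not attained (a deviator undercuts the collusive price by an arbitrarily small amount), which is why the weak inequality is the relevant one.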
5.4.3 The finitely repeated Folk theorems
We now consider finitely repeated games. In the prisoner's dilemma, we
have seen that for each $T$, $E_T = \{(1, 1)\}$. Hence the Pareto-optimal
equilibrium payoff $(3, 3)$ cannot be approximated by equilibrium payoffs of finitely
repeated games, and there is no convergence of $E_T$ to $E$.
Here again, one defines the subgame-perfect equilibria of $G_T$ as usual, i.e.
as strategy profiles $\sigma$ such that $\forall t \in \{0, ..., T - 1\}$, $\forall h \in H_t$,
$\sigma[h]$ is a Nash equilibrium of the game starting at $h$, i.e. of the game
$G_{T-t}$. We denote by $E'_T$ the (compact) set of subgame-perfect equilibrium
payoffs of $G_T$. We conclude with two last Folk theorems without proofs.
Theorem 5.4.5. Finitely repeated Folk theorem (Benoît & Krishna
1987) Assume that for each player $i$ there exists $x$ in $E_1$ such that $x^i > v^i$.
Then $E_T \to E$ as $T \to \infty$.
Theorem 5.4.6. Perfect finitely repeated Folk theorem (Gossner
1995) Assume that for each player $i$, there exist $x$ and $y$ in $E_1$ such that
$x^i > y^i$, and that $E$ has a non-empty interior. Then $E'_T \to E$ as $T \to \infty$.
5.5 Extensions: examples
We present here, without proofs, a few examples of general repeated games
(or Markovian Dynamic Games). In each of them, we assume that players
always remember their own past actions, they are assumed to have perfect
recall.
Example 5.5.1. A repeated game with signals
In a repeated game with signals (also called with imperfect observation or
with imperfect monitoring), players do not perfectly observe after each stage
the action profile that has been played, but receive private signals depending
on this action profile. In the following example there are 2 players, and the
sets of signals are given by $U^1 = \{u, v, w\}$ and $U^2 = \{*\}$. This means that
after each stage player 1 will receive a signal in $\{u, v, w\}$ whereas player 2
will always receive the signal $*$, which is equivalent to receiving no signal at
all: the observation of player 2 is said to be trivial. Payoffs in the stage game
and signals for player 1 are given by:
\[ \begin{array}{c|cc}
 & L & R \\ \hline
T & (0, 0),\ u & (4/5, 1),\ v \\
B & (1/5, 0),\ w & (1, 0),\ w
\end{array} \]
For example, if at some stage player 1 plays T and player 2 plays R then the
stage payoff is $(4/5, 1)$, and before playing the next stage player 1 receives the
signal $v$ (hence he can deduce that player 2 has played R and compute the
payoffs) whereas player 2 observes nothing (and in particular is not able to
deduce his payoff). Here $(4/5, 1)$ is feasible and since the punishment levels
are 0, it is also individually rational. However, one can show that at
equilibrium it is not possible to play (T, R) a significant number of times, otherwise
player 1 could profitably deviate by playing B without being punished
afterwards. Formally, one can show here that $E_\infty = \mathrm{conv}\{(1/5, 0), (1, 0)\}$, and
thus is a strict subset of the set of feasible and individually rational payoffs
$E$.
Computing $E_\infty$ for repeated games with signals is not known in general,
even for 2 players¹.
Example 5.5.2. A stochastic game
Stochastic games have been introduced by Shapley in 1953. In these
dynamic interactions there are several possible states, and to each state is
associated a strategic game. The state may evolve from stage to stage, and
at every stage the game corresponding to the current state is played. The
action profile and the current state determine the current payoffs and also
the transition to the next state.
The following example is a 2-player zero-sum game called the Big Match.
\[ \begin{array}{c|cc}
 & L & R \\ \hline
T & 1^* & 0^* \\
B & 0 & 1
\end{array} \]
The players start at stage 1 by playing the above matrix. They continue
as long as player 1 plays B (and both players observe after each stage the
action played by their opponent). However, if at some stage player 1 plays
T, then the game stops and either at that stage player 2 has played L and
player 1 has a payoff of 1 at all future stages, or player 2 has played R at that
stage and player 1 has a payoff of 0 at all future stages. Formally, there are 3
states here: the initial state, the state where player 1 has payoff 0 whatever
happens and the state where player 1 has payoff 1 whatever happens. The
last 2 states are said to be absorbing: once an absorbing state is reached the
play stays there forever (they are represented by a $*$ in the matrix).

¹ There is a wide literature on the topic, see among others Lehrer (1989, 1992),
Abreu, Pearce & Stacchetti (1990), Renault & Tomala (2004), Mailath & Samuelson
(2006) and Gossner & Tomala (2007).
Player 2 can play at each stage the mixed action 1/2 L + 1/2 R; by
doing so he guarantees that the expected payoff is 1/2 at every stage. Hence
$\exists \sigma^2, \forall T, \forall \sigma^1,\ \gamma^1_T(\sigma^1, \sigma^2) \le 1/2$. It is here more
difficult and fascinating to figure out good long-term strategies for player 1.
Indeed one can show that:
$\forall \varepsilon > 0, \exists \sigma^1, \exists T_0, \forall T \ge T_0, \forall \sigma^2,\ \gamma^1_T(\sigma^1, \sigma^2) \ge 1/2 - \varepsilon$,
and we say that the Big Match has uniform value 1/2. If the number of
repetitions is large enough, the fair price for this game is 1/2.
The existence of a uniform value for such stochastic games has been
proved by Mertens and Neyman in 1981. The generalization to the
existence of equilibrium payoffs of 2-player games was completed by Vieille in
2000, and the problem is still open for more than 2 players².
Example 5.5.3. A few repeated games with incomplete information
In a repeated game with incomplete information, there are several states
as well, and to each state is associated a possible stage game. One of the
states is selected once and for all at the beginning of the game, and at each
stage the players will play the game corresponding to this state. The state is
fixed and does not evolve from stage to stage. The difficulty comes from the
fact that the players imperfectly know it: each player initially receives a signal
depending on the selected state, and consequently may have incomplete
information on the state and on the knowledge of the other players about the
state.
In the following examples we have 2 players, zero-sum payoffs, and we assume
that there is lack of information on one side: initially the state $k \in \{a, b\}$
is selected according to the distribution $p = (1/2, 1/2)$, both players 1 and 2
know $p$ and the payoff matrices, but only player 1 learns $k$. Then the game
$G^k$ is repeated, and after each stage the actions played are publicly observed.
We denote by $v_T$ the value of the $T$-stage game with average payoffs and will
discuss, without proofs, the value of $\lim_{T \to \infty} v_T$. By theorem 5.5.1 below, this
limit will always exist and coincide with the limit $\lim_{\lambda \to 0} v_\lambda$ of the values of
the discounted games.

² A few important references for the non zero-sum case are Sorin (1986b), Flesch,
Thuijsman & Vrieze (1997), Solan (1999), Solan and Vieille (2001), and the surveys by
Mertens (2002), Vieille (2002), Solan (2009).
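Example 1. $G^a = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}$ and $G^b = \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix}$.
Easy. Player 1 can play T if the state is $a$ and B if the state is $b$, which
yields the payoff 0 at every stage. Hence $v_T = 0$ for each $T$, and
$\lim_{T \to \infty} v_T = 0$.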
Example 2. $G^a = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $G^b = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$.
A naive strategy of player 1 would be to play at stage 1 the action T
if the state is $a$, and the action B if the state is $b$. This strategy is called
completely revealing, because player 2 can deduce the selected state from
the action of player 1. It is optimal here in the 1-stage game, but very bad
when the number of repetitions is large. On the contrary, player 1 can play
without using his private information on the state, i.e. use a non revealing
strategy: he can consider the average matrix
\[ \tfrac12 G^a + \tfrac12 G^b = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}, \]
and play at each stage the corresponding optimal strategy 1/2 T + 1/2 B.
The value of this matrix being 1/4, we have: $v_T \ge \frac14$ for each $T$. And one
can show that playing non revealing is here the best that player 1 can do in
the long run: $v_T \to 1/4$ as $T \to \infty$.
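The value 1/4 of the average matrix can be recovered with the usual closed form for 2x2 zero-sum games. This is only a sketch: the function name is hypothetical, and the formula used when there is no pure saddle point is the standard indifference computation, not something taken from these notes.

```python
from fractions import Fraction

def value_2x2(a, b, c, d):
    """Value of the zero-sum matrix game [[a, b], [c, d]] (row player maximises)."""
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    maxmin = max(min(a, b), min(c, d))   # best guarantee of player 1 in pure strategies
    minmax = min(max(a, c), max(b, d))   # best guarantee of player 2 in pure strategies
    if maxmin == minmax:                 # saddle point in pure strategies
        return maxmin
    # otherwise both players mix so as to make the opponent indifferent
    return (a * d - b * c) / (a + d - b - c)

average_value = value_2x2(Fraction(1, 2), 0, 0, Fraction(1, 2))
print(average_value)
```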
Example 3. $G^a = \begin{pmatrix} 4 & 0 & 2 \\ 4 & 0 & -2 \end{pmatrix}$ and $G^b = \begin{pmatrix} 0 & 4 & -2 \\ 0 & 4 & 2 \end{pmatrix}$.
Playing a completely revealing strategy guarantees nothing positive for
player 1, because player 2 can finally play M(iddle) if the state is $a$, and
L(eft) if the state is $b$. Playing a non revealing strategy leads to considering
the average game $\frac12 G^a + \frac12 G^b = \begin{pmatrix} 2 & 2 & 0 \\ 2 & 2 & 0 \end{pmatrix}$, and guarantees nothing positive
either.
It is here interesting for player 1 to play the following strategy $\sigma^1$: initially
he selects once and for all an element $s$ in $\{T, B\}$ as follows: if $k = a$, choose
$s = T$ with probability 3/4, and $s = B$ with probability 1/4; and if $k = b$,
choose $s = T$ with probability 1/4, and $s = B$ with probability 3/4. Then
player 1 plays $s$ at each stage, independently of the moves of player 2.
The conditional probabilities satisfy: $P(k = a \mid s = T) = 3/4$, and
$P(k = a \mid s = B) = 1/4$. Hence at the end of the first stage, player 2 having observed
the first move of player 1 will learn something on the state of nature: his
belief will move from $\frac12 a + \frac12 b$ to $\frac34 a + \frac14 b$ or to $\frac14 a + \frac34 b$. But still he does not
entirely know the state: there is partial revelation of information. The games
$\frac34 G^a + \frac14 G^b$ and $\frac14 G^a + \frac34 G^b$ both have value 1, and respective optimal strategies
for player 1 are given by T and by B, which correspond to the actions played by
player 1. Hence playing $\sigma^1$ guarantees 1 to player 1:
$\forall T, \forall \sigma^2,\ \gamma^1_T(\sigma^1, \sigma^2) \ge 1$.
And one can prove that player 1 cannot do better here in the long run:
$v_T \to 1$ as $T \to \infty$.
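The posterior beliefs of player 2 follow from Bayes' rule applied to the state-dependent lottery of player 1. A minimal sketch, with hypothetical variable names:

```python
from fractions import Fraction

# prior on the state and conditional law P(s | k) of the action selected by player 1
prior = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
lottery = {"a": {"T": Fraction(3, 4), "B": Fraction(1, 4)},
           "b": {"T": Fraction(1, 4), "B": Fraction(3, 4)}}

def posterior(s):
    """Belief of player 2 on the state after observing that player 1 played s."""
    total = sum(prior[k] * lottery[k][s] for k in prior)  # P(s)
    return {k: prior[k] * lottery[k][s] / total for k in prior}

print(posterior("T"))  # belief after T
print(posterior("B"))  # belief after B
```

Both actions have total probability 1/2, so the posteriors 3/4 a + 1/4 b and 1/4 a + 3/4 b average back to the prior, as they must.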
General case of lack of information on one side. The next result is
due to Aumann and Maschler in the sixties (1966, with a reedition in 1995)
and has been generalized in a great number of ways since then³. It is valid
for any finite set of states $K$ and any initial probability on $K$.
Theorem 5.5.1. In a zero-sum repeated game with lack of information on
one side where the initial probability is $p$ and the payoff matrices are given
by $(G^k)_{k \in K}$, we have
\[ \lim_{T \to \infty} v_T(p) = \lim_{\lambda \to 0} v_\lambda(p) = \mathrm{cav}\, u(p), \]
where $u$ is the mapping from $\Delta(K)$ to $\mathbb{R}$ such that $u(q)$ is the value of the
matrix game $\sum_k q^k G^k$ for each probability $q = (q^k)_{k \in K}$, and $\mathrm{cav}\, u$ is the
smallest concave function above $u$.
[Figure: graphs of $u(p)$ and $\mathrm{cav}\, u(p)$ for $p$ in $[0, 1]$, with ticks at 1/4, 1/2, 3/4
and 1 on the horizontal axis and at 1 on the vertical axis.]
This picture corresponds to example 3: $\mathrm{cav}\, u(1/2) = \frac12 u(1/4) + \frac12 u(3/4) = 1$.
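The functions $u$ and $\mathrm{cav}\, u$ of example 3 can be tabulated on a grid. The sketch below assumes that the matrices of example 3 are $G^a = ((4, 0, 2), (4, 0, -2))$ and $G^b = ((0, 4, -2), (0, 4, 2))$, with $p$ the probability of state $a$; all function names are hypothetical, and the envelope computed on a grid is only an approximation of $\mathrm{cav}\, u$ in general (it is exact here because the kinks of $u$ are grid points).

```python
from fractions import Fraction
from itertools import combinations

GA = [[4, 0, 2], [4, 0, -2]]   # assumed payoff matrix in state a
GB = [[0, 4, -2], [0, 4, 2]]   # assumed payoff matrix in state b

def value_two_rows(M):
    """Value of a zero-sum matrix game with 2 rows (row player maximises).

    With x the probability of the top row, min_j (x*top[j] + (1-x)*bottom[j]) is
    concave and piecewise linear in x, so it is maximal at 0, at 1, or at a point
    where two columns give the same payoff.
    """
    top, bottom = M
    candidates = {Fraction(0), Fraction(1)}
    for j, k in combinations(range(len(top)), 2):
        denom = (top[j] - bottom[j]) - (top[k] - bottom[k])
        if denom != 0:
            x = Fraction(bottom[k] - bottom[j]) / denom
            if 0 <= x <= 1:
                candidates.add(x)
    return max(min(x * t + (1 - x) * b for t, b in zip(top, bottom))
               for x in candidates)

def u(p):
    """Value of the non revealing game p*GA + (1-p)*GB."""
    M = [[p * a + (1 - p) * b for a, b in zip(ra, rb)] for ra, rb in zip(GA, GB)]
    return value_two_rows(M)

def upper_hull(points):
    """Vertices of the smallest concave function above points sorted by abscissa."""
    hull = []
    for x, y in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop the last vertex if it lies on or below the chord to the new point
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull

def evaluate(hull, x):
    """Linear interpolation between consecutive vertices of the hull."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("x is outside the grid")

grid = [Fraction(i, 8) for i in range(9)]
hull = upper_hull([(p, u(p)) for p in grid])
print(u(Fraction(1, 2)), evaluate(hull, Fraction(1, 2)))
```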
³ E.g. for zero-sum games, Mertens & Zamir 1971, 1977, Sorin & Zamir 1985, de Meyer
1996a, 1996b, Laraki 2001, Rosenberg, Solan & Vieille 2004, Renault 2006, and for non
zero-sum games Sorin 1983, Hart 1985, Simon et al. 1995, Renault 2000...
Chapter 6
Exercises
6.1 Strategic games
Exercise 6.1.1. Roulette wheels and Condorcet paradox.
There are 2 players and 3 independent roulette wheels A, B and C.
Roulette A selects 1, 6 or 8 with equal probabilities, roulette B selects 3,
5 or 7 with equal probabilities, and roulette C selects 2, 4 or 9 with equal
probabilities. The game is played as follows. First, player 1 chooses one and
only one of the wheels A, B and C. Player 2 observes the choice of player 1
and chooses one of the two remaining wheels. Then both chosen wheels are
run, and each player gets a number from his own roulette. The player with
the highest number wins and receives 1 euro from his opponent.
Imagine you are player 1: how much are you willing to pay to play this
game? Why is it called a Condorcet paradox?
Dominant strategies
Exercise 6.1.2. (*)
1- An assembly with 99 members has to choose, according to the majority
rule, between 2 projects $a$ and $b$. Each member $i$ in $\{1, ..., 99\}$ has a utility
function $u^i : \{a, b\} \to \{0, 1\}$ which is assumed to be bijective. In order to
vote, each member writes $a$ or $b$ on a voting paper in a sealed envelope.
Then the envelopes are opened and the project receiving the majority of
votes is adopted. Write this interaction as a strategic game. Show that for
each member of the assembly, voting for his favorite project is a dominant
strategy.
2- There are now 3 voters and 3 projects $a$, $b$, $c$. Assume that the utilities
are such that $u^1(a) = u^2(b) = u^3(c) = 2$, $u^1(b) = u^2(c) = u^3(b) = 1$, and
$u^1(c) = u^2(a) = u^3(a) = 0$. In order to vote, each member now writes $a$, $b$
or $c$ on a voting paper in a sealed envelope. Then the envelopes are opened
and the project with the majority of votes is adopted; if it happens that each
project receives one vote, no project is adopted and the utility of everyone is
assumed to be zero. Is there an equilibrium in dominant strategies?
Exercise 6.1.3. Second-price auctions with sealed bids (Vickrey auctions)
(**)
There is a single good, and $n$ potential buyers labeled from 1 to $n$. The
procedure is the following. Each buyer bids by submitting a written offer in
a sealed envelope. All the offers are then transmitted to an auctioneer. The
buyer with the highest bid receives the good and pays a price corresponding
to the highest bid of all other buyers (in case of a tie, by convention the
buyer with the smallest label buys the object at the highest price).
Each buyer $i$ has a valuation $v^i$ in $\mathbb{R}_+$, representing the maximal amount
he is willing to pay for the good. If buyer $i$ eventually buys the good at price
$p$, his utility is $v^i - p$. If he does not buy the good, his utility is 0.
Write this situation as a strategic game. Show that for each buyer $i$,
bidding his valuation $v^i$ is a dominant strategy.
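The allocation and payment rule can be written down explicitly before turning to the proof. The following is only an illustrative sketch of the rule described above (function names are hypothetical, and buyers are labelled 1 to n as in the text); it does not replace the argument asked for in the exercise.

```python
def second_price_outcome(bids):
    """Return (winner, price): the highest bid wins, ties go to the smallest label,
    and the winner pays the highest bid among the other buyers."""
    if len(bids) < 2:
        raise ValueError("at least two buyers are needed")
    winner = min(range(len(bids)), key=lambda i: (-bids[i], i))
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner + 1, price

def utility(i, valuations, bids):
    """Utility of buyer i (label between 1 and n) for a given profile of bids."""
    winner, price = second_price_outcome(bids)
    return valuations[i - 1] - price if winner == i else 0

print(second_price_outcome([5, 8, 8]))       # buyers 2 and 3 tie, buyer 2 wins
print(utility(2, [3, 10, 4], [5, 8, 8]))     # buyer 2 pays the other highest bid
```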
Exercise 6.1.4. (**) At the Olympic games, a jury with 3 members has to
grade a unique gymnast, and more precisely the jury has to give the gymnast
a single mark (or grade) $x$ in $[0, 20]$. Put $N = \{1, 2, 3\}$. Each jury member
$i$ in $N$ has a valuation $v^i \in [0, 20]$, and thinks that the appropriate mark for
the gymnast is precisely $v^i$. More precisely, the utility of member $i$ if the
given mark is $x$ is: $20 - |v^i - x|$. The valuations are $v^1 = 9$, $v^2 = 13$ and
$v^3 = 14$, and commonly known by the members of the jury.
II.A) The grading rule is the following: simultaneously, each member $i$
gives a personal grade $x^i$ in $[0, 20]$, and the gymnast receives the final grade
$x = \frac13 (x^1 + x^2 + x^3)$.
Write this interaction as a strategic game, and compute the pure Nash
equilibria.
II.B) The grading rule is changed: simultaneously, each member $i$ gives
a personal grade $x^i$ in $[0, 20]$, and the gymnast receives as final grade the
median of the personal grades, i.e. the unique grade $x \in \{x^1, x^2, x^3\}$ such
that both sets $\{i \in N, x^i \le x\}$ and $\{i \in N, x^i \ge x\}$ have at least 2 elements.
(Examples: if $x^1 = 7$, $x^2 = 16$ and $x^3 = 11$, then the median is 11; if $x^1 = 7$,
$x^2 = 16$ and $x^3 = 7$, then the median is 7.)
Write this interaction as a strategic game, and show that there exists an
equilibrium in dominant strategies.
Nash equilibria
Exercise 6.1.5. (*)
Compute the Nash equilibria (in pure strategies) of the following 3-player
game (player 1 chooses the row, player 2 chooses the column, player 3 chooses
the matrix, the first component is the payoff for player 1, the second for player
2, the third for player 3).
\[
O:\ \begin{array}{c|cc}
 & G & D \\ \hline
H & (0, 0, 0) & (0, 1, 1) \\
B & (2, 2, 2) & (1, 3, 4)
\end{array}
\qquad
E:\ \begin{array}{c|cc}
 & G & D \\ \hline
H & (8, 4, 2) & (7, 7, 2) \\
B & (9, 2, 5) & (10, 1, 0)
\end{array}
\]
Exercise 6.1.6. (*)
Compute the Nash equilibria (in pure strategies) of the following 3-player
game (player 1 chooses the row, player 2 chooses the column, player 3 chooses
the matrix, the first component is the payoff for player 1, the second for player
2, the third for player 3).
\[
O:\ \begin{array}{c|cc}
 & G & D \\ \hline
H & (1, 0, 1) & (1, 1, 1) \\
B & (2, 0, 3) & (5, 1, 4)
\end{array}
\qquad
E:\ \begin{array}{c|cc}
 & G & D \\ \hline
H & (6, 4, 1) & (1, 2, 3) \\
B & (8, 0, 2) & (0, 0, 0)
\end{array}
\]
Exercise 6.1.7. Cournot duopoly: simplest case (*)
Two firms produce the same good with the same marginal cost $c \in\ ]0, 1[$.
The demand of the consumers for a good at price $p$ is $D(p) = 1 - p$.
Simultaneously each firm $i \in \{1, 2\}$ chooses a quantity $q^i \ge 0$ to produce.
Then the market equalizes the supply and the demand, and the price is set
to $p = 1 - (q^1 + q^2)$. Each firm $i$ sells the quantity $q^i$ at the price $p$.
Assume that each firm wants to maximize its profit. Write this interaction as
a strategic-form game. Show that there exists a unique Nash equilibrium.
Exercise 6.1.8. (**) Compute the Nash equilibria of the following strategic
games. In each case, $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$, where $N = \{1, 2\}$,
$A^1 = A^2 = [0, 1]$, the variable $x$ stands for player 1's strategy, and $y$ stands for
player 2's strategy:
a) $g^1(x, y) = 5xy - x - y + 2$; $g^2(x, y) = 5xy - 3x - 3y + 5$.
b) $g^1(x, y) = 2x^2 + 7y^2 + 4xy$; $g^2(x, y) = (x + y - 1)^2$.
c) $g^1(x, y) = 2x^2 + 7y^2 + 4xy$; $g^2(x, y) = (x - y)^2$.
Exercise 6.1.9. (*)
$n$ fishermen exploit a lake containing a large quantity of fish. As an
approximation the quantity of fish is assumed to be infinite. If each fisherman
$i$ fishes a quantity $x^i \ge 0$, the unit price is set to $p = \max(1 - \sum_{i=1}^n x^i, 0)$.
Each fisherman sells all his production at price $p$ and tries to maximize his
revenue (production costs are assumed to be zero).
1- Write the revenue of fisherman $i$ as a function of $(x^1, ..., x^n)$, compute
the Best Response correspondence, the Nash equilibria and the total revenue
at equilibrium.
3- Compare with the monopoly case ($n = 1$).
Exercise 6.1.10. (**)
Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a symmetric two-player game, i.e. such
that: $N = \{1, 2\}$, $A^1 = A^2$ denoted by $X$, and
$\forall (x^1, x^2) \in X \times X$, $g^1(x^1, x^2) = g^2(x^2, x^1)$.
Assume that $X$ is a convex and compact subset of $\mathbb{R}^n$, that $g^1$ is
continuous and that for each $x^2$ in $X$, the mapping $(x^1 \mapsto g^1(x^1, x^2))$ is quasiconcave.
Show that there exists a symmetric Nash equilibrium, i.e. that there
exists an element $x^*$ in $X$ such that $(x^*, x^*)$ is a Nash equilibrium of $G$.
Exercise 6.1.11. (*)
a) Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a strategic game such that:
$\forall i \in N, \forall a^i \in A^i, \forall b^i \in A^i, \forall a^{-i} \in A^{-i},\ g^i(a^i, a^{-i}) = g^i(b^i, a^{-i})$.
What is the set of Nash equilibria of $G$?
b) Can you find a finite game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ where: $N = \{1, 2\}$
(there are two players), $g^1 = g^2$ (the players have the same payoff functions),
and the set of pure Nash equilibrium payoffs is exactly $\{(0, 0), (1, 1), (2, 2)\}$?
Exercise 6.1.12. (*)
Fix $n \ge 1$, and $f$ and $l$ two non-increasing mappings from $\{1, ..., n\}$ into
the reals. $n$ tourists spend their holidays in a village having 2 beaches X
and Y. Each tourist has to choose one of the beaches to spend the day, the
choices being simultaneous. The utility obtained by a tourist at beach X is
given by $f(x)$, where $x$ is the total number of tourists at beach X. Similarly,
the utility obtained by a tourist at beach Y is given by $l(y)$, where $y$ is the
total number of tourists at beach Y. In this exercise only pure strategies are
considered.
1) Write this interaction as a strategic game.
2) We assume here that $f = l$. Show that a pure Nash equilibrium exists.
3) Find the Nash equilibria in the following case: $n = 100$,
$f(x) = (100 - x)^2$ and $l(y) = 5(100 - y)$ for each $x$ and $y$ in $\{0, ..., 100\}$.
Exercise 6.1.13. Guess 1/2 of the average (*)
There are $n$ players, each player $i$ in $\{1, ..., n\}$ has to choose some number
$x^i$ in $[0, 20]$. The payoff to player $i$ is then $20 - |x^i - \frac12 \bar x|$, where
$\bar x = \frac1n \sum_{j=1}^n x^j$ is the average of the chosen numbers. Compute the Nash
equilibria of this game.
Exercise 6.1.14. (**)
Two firms A and B are located in the same city and produce the same
good. The constant marginal cost of each firm is set to 1/2. Each firm has
to fix a unit price, denoted by $p_A$ for firm A and $p_B$ for firm B.
The city is represented by a continuum of consumers. Each consumer buys
one unit of the good from only one of the two firms. The set of consumers
is uniformly distributed in the city, which is linear and represented by the
interval $[0, 1]$.
[Figure: the city drawn as the segment [0, 1], with firm A at 0, firm B at 1
and a consumer at position x.]
Firm A is located at the western point of the city ($x = 0$), whereas firm
B is at the eastern point of the city ($x = 1$).
Each consumer has a transportation cost assumed to be linear. If a
consumer located at $x \in [0, 1]$ buys the good from firm A, his cost is assumed
to be $p_A + x/2$, and if he buys the good from firm B, his cost is $p_B + \frac{1 - x}{2}$.
Each consumer chooses to buy one unit of the good from the firm giving him
the lowest cost.
We consider the game where each firm chooses its price in $[1/2, +\infty[$,
and sells one unit of the good to each consumer coming to its location. The
number of inhabitants of the city is assumed to be 1 (e.g., the unit may be
the million), and if a proportion of 3/4 of the inhabitants buys from firm A, the
profit of this firm is $\frac34 (p_A - 1/2)$ (the profit of firm B being then $\frac14 (p_B - 1/2)$).
Each firm wants to maximize its profit. Denote by $g_A$ and $g_B$ the payoff
function of firm A and B, respectively.
A) Computing the payoff functions
A1. Show that if $p_B - p_A > 1/2$, then $g_A(p_A, p_B) = p_A - 1/2$ and
$g_B(p_A, p_B) = 0$. Compute $g_A(p_A, p_B)$ and $g_B(p_A, p_B)$ when $p_A - p_B > 1/2$.
A3. Show that if $p_B - p_A \in [-1/2, +1/2]$,
$g_A(p_A, p_B) = (1/2 + p_B - p_A)(p_A - 1/2)$ and $g_B(p_A, p_B) = (1/2 + p_A - p_B)(p_B - 1/2)$.
B) Computing the best responses
Fix the price $p_B$ of firm B. Show that the best response of firm A is to
choose the price $p_A$ such that $p_A = \frac{1 + p_B}{2}$ if $p_B \le 2$, $p_A = p_B - 1/2$ if $p_B \ge 2$.
C) Compute the Nash equilibria of the game, and the associated
equilibrium payoffs.
Exercise 6.1.15. (**) Compute the Nash equilibria of the 2-player game
where $A^1 = A^2 = [0, 1)$, and for each strategy pair $(x, y)$ in $A^1 \times A^2$ the
payoff for each player is the product $xy$, i.e. $g^1(x, y) = g^2(x, y) = xy$.
Exercise 6.1.16. Congestion games and Braess's paradox (**)
6 travellers want to drive from O (Origin) to D (Destination), using roads
$a$, $b$, $c$, and $d$. These roads are oriented as follows, so that there are only 2
possible paths from O to D: the path $(a, c)$ and the path $(b, d)$.
[Figure: oriented road network from O to D. Roads a and b leave O, road c
follows a and road d follows b, both ending at D.]
Each player wants to minimize his travel cost (or equivalently to maximize
the opposite of this cost). The travel cost of a path is defined as the sum
of the travel costs of the roads of this path. The travel cost of a given road
is expressed in minutes and is assumed to be an increasing function of the
number of travellers using this particular road. Denote by $x_a$, $x_b$, $x_c$ and $x_d$
the number of travellers respectively using road $a$, $b$, $c$, $d$. The travel cost
for road $a$ is defined as $t_a = 10 x_a$, which means that covering road $a$ lasts
$10 x_a$ minutes. Similarly, we define the travel costs of the other roads by:
$t_b = 50 + x_b$, $t_c = 50 + x_c$ and $t_d = 10 x_d$. Each traveller chooses one of the
possible paths, and we assume that these choices are simultaneous. Only pure
strategies are considered.
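The travel cost of each path for a given split of the travellers can be computed mechanically. This is only a sketch of the cost rule of the game without road $e$; the function name is hypothetical.

```python
def path_costs(n_ac, n_bd):
    """Travel time in minutes on each path when n_ac travellers use (a, c)
    and n_bd travellers use (b, d)."""
    x_a = x_c = n_ac          # roads a and c are only used by path (a, c)
    x_b = x_d = n_bd          # roads b and d are only used by path (b, d)
    cost_ac = 10 * x_a + (50 + x_c)
    cost_bd = (50 + x_b) + 10 * x_d
    return cost_ac, cost_bd

print(path_costs(3, 3))   # balanced split of the 6 travellers
print(path_costs(4, 2))   # unbalanced split
```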
1) Write this interaction as a strategic game $G$.
2) Show that at any pure Nash equilibrium, each traveller has a
transportation cost of 83 minutes. How many pure Nash equilibria are there?
3) A new road $e$ is built, so that there are now 3 possible paths from O
to D: $(a, c)$, $(b, d)$ and $(a, e, d)$. Note that road $e$ is oriented: it is not possible
to go first $b$ then $e$.
[Figure: the same network with the additional oriented road e, going from the
end of road a to the start of road d.]
The travel cost for road $e$ is set as: $t_e = x_e + 10$, where $x_e$ is the number
of travellers using road $e$. We still denote by $x_a$, $x_b$, $x_c$ and $x_d$ the total
number of travellers respectively using roads $a$, $b$, $c$, $d$, and still have the
costs: $t_a = 10 x_a$, $t_b = 50 + x_b$, $t_c = 50 + x_c$ and $t_d = 10 x_d$. This defines a
new strategic game $G'$.
3.a) Consider a strategy profile where both paths $(a, c)$ and $(b, d)$ are
chosen by 3 travellers. Is it a Nash equilibrium of $G'$?
3.b) Find one pure Nash equilibrium of $G'$ (it is not asked to find all such
Nash equilibria). Compare the transportation costs at this equilibrium with
the equilibrium costs of $G$.
Mixed strategies
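Exercise 6.1.17. (*) Compute the Nash equilibria in mixed strategies of
the following games.
\[
\begin{array}{c|cc}
 & G & D \\ \hline
H & (1, 2) & (0, 0) \\
B & (0, 0) & (2, 1)
\end{array}
\qquad
\begin{array}{c|cc}
 & G & D \\ \hline
H & (6, 6) & (2, 7) \\
B & (7, 2) & (0, 0)
\end{array}
\qquad
\begin{array}{c|cc}
 & G & D \\ \hline
H & (5, 4) & (3, 3) \\
B & (3, 3) & (7, 8)
\end{array}
\]
\[
\begin{array}{c|cc}
 & G & D \\ \hline
H & (2, 2) & (1, 1) \\
B & (3, 3) & (4, 4)
\end{array}
\qquad
\begin{array}{c|cc}
 & G & D \\ \hline
H & (1, 0) & (2, 1) \\
B & (1, 1) & (0, 0)
\end{array}
\qquad
\begin{array}{c|ccc}
 & g & m & d \\ \hline
H & (1, 1) & (0, 0) & (8, 0) \\
M & (0, 0) & (4, 4) & (0, 0) \\
B & (0, 8) & (0, 0) & (6, 6)
\end{array}
\]
\[
O:\ \begin{array}{c|cc}
 & G & D \\ \hline
H & (1, 1, 1) & (0, 0, 0) \\
B & (0, 0, 0) & (0, 0, 0)
\end{array}
\qquad
E:\ \begin{array}{c|cc}
 & G & D \\ \hline
H & (0, 0, 0) & (0, 0, 0) \\
B & (0, 0, 0) & (1, 1, 1)
\end{array}
\]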
Exercise 6.1.18. For each real number $t$, we consider the following 2-player
game $G_t$ (where as usual, player 1 chooses the row, player 2 chooses the
column, the first component is the payoff for player 1 and the second is the
payoff for player 2).
\[ \begin{array}{c|cc}
 & L & R \\ \hline
T & (t, t) & (0, 0) \\
B & (0, 0) & (1 - t, 1 - t)
\end{array} \]
II.a) Show that for each $t$ and each mixed Nash equilibrium payoff $(u, v)$
of $G_t$, we have $u = v$.
II.b) Compute, for each $t$, the pure Nash equilibria of $G_t$.
III.c) Assume that $t \in (0, 1)$. Compute the mixed Nash equilibria of $G_t$.
III.d) Compute for each $t$ the set $F(t)$ of real numbers $u$ such that $(u, u)$
is a mixed Nash equilibrium payoff of $G_t$. Draw the graph of the
correspondence which associates to each real number $t$ the set $F(t)$.
Exercise 6.1.19. An alternate proof of Nash's theorem. (**)
Notation: if $\alpha$ is a real number, we set $\alpha^+ = \max\{\alpha, 0\}$.
Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a finite strategic game, with
$N = \{1, ..., n\}$. Define the mapping $f$ from the set $\prod_{i \in N} \Delta(A^i)$ of mixed strategy
profiles to itself which associates to every $x = (x^1, ..., x^n)$ in $\prod_{i \in N} \Delta(A^i)$ the
profile $y = (y^1, ..., y^n)$ in $\prod_{i \in N} \Delta(A^i)$ such that:
\[ \forall i \in N, \forall a^i \in A^i, \quad
y^i(a^i) = \frac{x^i(a^i) + \big(g^i(a^i, x^{-i}) - g^i(x)\big)^+}{1 + \sum_{b^i \in A^i} \big(g^i(b^i, x^{-i}) - g^i(x)\big)^+}. \]
Use Brouwer's fixed point theorem to show that $G$ has a mixed Nash
equilibrium.
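To get a feeling for the mapping $f$, one can apply it numerically in a small game. The sketch below is restricted to 2 players for brevity (an assumption made only to keep the code short), the function name is hypothetical, and the test game is matching pennies; it illustrates the mapping but is of course not a proof.

```python
from fractions import Fraction

def nash_map(x1, x2, g1, g2):
    """Image of the mixed profile (x1, x2) under the mapping f, for 2 players.

    g1[a][b] and g2[a][b] are the payoffs when player 1 plays a and player 2 plays b.
    """
    rows, cols = range(len(x1)), range(len(x2))
    # payoff of each pure action against the mixed strategy of the opponent
    pure1 = [sum(x2[b] * g1[a][b] for b in cols) for a in rows]
    pure2 = [sum(x1[a] * g2[a][b] for a in rows) for b in cols]
    # payoff of each player under the mixed profile x
    cur1 = sum(x1[a] * pure1[a] for a in rows)
    cur2 = sum(x2[b] * pure2[b] for b in cols)
    # positive parts of the gains from switching to a pure action
    gain1 = [max(p - cur1, 0) for p in pure1]
    gain2 = [max(p - cur2, 0) for p in pure2]
    y1 = [(x1[a] + gain1[a]) / (1 + sum(gain1)) for a in rows]
    y2 = [(x2[b] + gain2[b]) / (1 + sum(gain2)) for b in cols]
    return y1, y2

# Matching pennies: player 1 wants to match, player 2 wants to mismatch.
g1 = [[1, -1], [-1, 1]]
g2 = [[-1, 1], [1, -1]]
half = Fraction(1, 2)
print(nash_map([half, half], [half, half], g1, g2))                       # unchanged
print(nash_map([Fraction(1), Fraction(0)], [Fraction(1), Fraction(0)], g1, g2))
```

At the uniform profile no pure action improves on the current payoff, so all gains vanish and the profile is mapped to itself; at the pure profile (first row, first column) player 2 gains by switching, and the mapping shifts weight towards his second action.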
Exercise 6.1.20. Feasible payoffs (**)
Denote by $j$ the complex number $e^{2i\pi/3}$, and by $g$ the mapping from $\mathbb{C}^3$
to $\mathbb{C}$ defined by $g(a, b, c) = abc$. Consider the 3-player game $G$, where
$A^1 = \{1, j\}$, $A^2 = \{j, j^2\}$, and $A^3 = \{j^2, 1\}$. For $(a, b, c)$ in $A^1 \times A^2 \times A^3$, the
payoff for player 1 is defined as the real part of the complex number $g(a, b, c)$,
the payoff of player 2 is defined as the imaginary part of $g(a, b, c)$, and the
payoff for player 3 is always zero.
a) Define the mixed extension of $G$.
b) Consider the set of feasible payoffs with mixed strategies, i.e. the set
of vector payoffs that can be obtained when the players use mixed strategies.
Is it a convex set? Is it contractible?
Remark: A metric space $X$ is contractible if there exists a continuous map
$f : X \times [0, 1] \to X$ such that: 1) $f(x, 0) = x$ for all $x$ in $X$, and 2) $f(x, 1)$ is
independent of $x$. Intuitively, a contractible space is one that can be
continuously shrunk to a point.
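Exercise 6.1.21. Value of matrix games (*)
Compute the value of the following zero-sum games.
\[
\begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}, \quad
\begin{pmatrix} 3 & 2 \\ 5 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix}, \quad \text{and} \quad
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]
(where $a$, $b$, $c$ and $d$ are real parameters).
Same question for the matrix:
\[ \begin{pmatrix} 3 & 1 \\ 0 & 0 \\ 2 & 4 \\ 7 & 2 \end{pmatrix} \]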
Exercise 6.1.22. Let $G$ be the zero-sum game given by the following
matrix:
\[ \begin{pmatrix} 0 & 2 \\ 3 & 0 \end{pmatrix} \]
a) Compute the value and the mixed optimal strategies of $G$.
b) Consider a decision maker who can invest today in 3 distinct assets $i_1$,
$i_2$ and $i_3$, each of unit price 1 (thousand euros) today. The state of the world
tomorrow is unknown, there are 3 possibilities $j_1$, $j_2$ and $j_3$. The following
matrix indicates the return of each asset as a function of the state:
\[ \begin{array}{c|ccc}
 & j_1 & j_2 & j_3 \\ \hline
i_1 & 0 & 2 & 2 \\
i_2 & 3 & 0 & 1 \\
i_3 & 1 & 1 & 1
\end{array} \]
For instance, if the state tomorrow is $j_2$, then one unit of the asset $i_1$ will be
worth 2 (thousand euros). The asset $i_3$ is risk-free, its return is 1 whatever
the state tomorrow. The decision maker wants to invest at most 5 thousand
euros today, and he can naturally split his investment among the 3
assets. Assuming that there is no depreciation of money between today and
tomorrow, and that the goal of the decision maker is to maximize his wealth
tomorrow in the worst case, how should he invest today?
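Exercise 6.1.23. Perfect and proper equilibria (**)
Let $G = (N, (S^i)_{i \in N}, (g^i)_{i \in N})$ be a finite strategic game. A mixed strategy
profile $\sigma = (\sigma^i)_{i \in N}$ is said to be completely mixed if $\sigma^i(s^i) > 0$
$\forall i \in N, \forall s^i \in S^i$.
Definition: Fix $\varepsilon > 0$.
$\sigma = (\sigma^i)_{i \in N}$ is an $\varepsilon$-perfect equilibrium if $\sigma$ is completely mixed and if:
\[ \forall i \in N, \forall s^i \in S^i, \quad \Big( g^i(s^i, \sigma^{-i}) < \max_{t^i \in S^i} g^i(t^i, \sigma^{-i}) \Big) \Longrightarrow \big( \sigma^i(s^i) \le \varepsilon \big). \]
$\sigma = (\sigma^i)_{i \in N}$ is an $\varepsilon$-proper equilibrium if $\sigma$ is completely mixed and if:
\[ \forall i \in N, \forall s^i \in S^i, \forall t^i \in S^i, \quad \big( g^i(s^i, \sigma^{-i}) < g^i(t^i, \sigma^{-i}) \big) \Longrightarrow \big( \sigma^i(s^i) \le \varepsilon\, \sigma^i(t^i) \big). \]
$\sigma$ is a perfect (resp. proper) equilibrium if there exists a sequence
$(\varepsilon_t, \sigma_t)_{t \in \mathbb{N}}$ such that: $\forall t$, $\sigma_t$ is an $\varepsilon_t$-perfect (resp. $\varepsilon_t$-proper)
equilibrium, $\varepsilon_t \to 0$ and $\sigma_t \to \sigma$ as $t \to \infty$.
The notion of perfect equilibrium (also called trembling-hand perfect
equilibrium) has been introduced by R. Selten in 1975, the one of proper
equilibrium by R. Myerson in 1978.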
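1. Existence. Fix $\varepsilon$ in $]0, 1[$.
For each player $i$, define $\eta^i = \varepsilon^{|S^i|} / |S^i|$ and
$\Sigma^i(\eta^i) = \{\sigma^i \in \Delta(S^i),\ \sigma^i(s^i) \ge \eta^i\ \forall s^i \in S^i\}$, and we set
$\Sigma(\eta) = \prod_{i \in N} \Sigma^i(\eta^i)$. Consider now the correspondence:
\[ F : \Sigma(\eta) \rightrightarrows \Sigma(\eta), \qquad \sigma \mapsto \prod_{i \in N} F^i(\sigma), \]
where for each $i$ in $N$ and $\sigma$ in $\Sigma(\eta)$,
\[ F^i(\sigma) = \big\{ \tau^i \in \Sigma^i(\eta^i),\ \forall s^i, \forall t^i,\ \big( g^i(s^i, \sigma^{-i}) < g^i(t^i, \sigma^{-i}) \big) \Longrightarrow \tau^i(s^i) \le \varepsilon\, \tau^i(t^i) \big\}. \]
1.a) Show that $F$ has non-empty values.
1.b) Apply Kakutani's theorem and conclude on the existence of a proper
and perfect equilibrium.
2. Compute the Nash equilibria, the perfect equilibria and the proper
equilibria of the following 2-player games.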
L R
T (1,1) (0,0)
B (0,0) (0,0)
L M R
T (1,1) (0,0) (-9,-10)
M (0,0) (0,0) (-7,-10)
B (-10,-9) (-10,-7) (-10,-10)
Exercise 6.1.24. All-pay auctions.
There is an auction with 2 bidders (the players) and one good. Both
bidders have the same valuation for the good, let's say $v > 0$. Each player $i$
submits a bid $b^i \ge 0$ in a sealed envelope, the highest bid obtains the good
(in case of equal bids the player who obtains the good is selected with equal
probability) but both players pay their bids. Hence the utility of a player $i$
is $v - b^i$ if he obtains the good, and $-b^i$ otherwise.
Write this interaction as a strategic game. Compute the set of pure Nash
equilibria. Show that a mixed Nash equilibrium of the game is given by each
player choosing his bid with the uniform distribution on $[0, v]$. What is the
expected sum of bids?
Extension to $n$ players?
6.2 Extensive-form games
Games with perfect information
Exercise 6.2.1. Let G be the following extensive-form game:
[Figure: game tree in which player 1 (P1) chooses between T and B and player 2
(P2) chooses between L and R, with terminal payoffs (3, 1), (1, 0) and (0, 2).]
A) Write the associated strategic form.
B) Compute the pure Nash equilibria of G.
C) Compute the mixed Nash equilibria of G. What is the set of payoffs
associated to these equilibria?
D) Compute the subgame-perfect equilibria of G.
Exercise 6.2.2. Let G be the following extensive-form game:
[Figure: game tree of G. Node labels: P1, P2. Action labels: T, B and L, R. Terminal payoffs: (3, 1), (1, 1), (0, 2).]
A) Write the associated strategic form.
B) Compute the pure Nash equilibria of G.
C) Compute the mixed Nash equilibria of G. What is the set of payoffs associated to these equilibria?
D) Compute the subgame-perfect equilibria of G.
Exercise 6.2.3.
[Figure: three-player game tree. Node labels: P1, P2, P3 (twice), root r, nodes x1 to x6. Action labels: E, W, D; a, b; L, R. Terminal payoffs, in reading order: (1, 0, 0), (2, 0, 0), (0, 0, 0), (3, 1, 1), (1, 2, 0), (2, 0, 0), (0, 0, 2), (2, 2, 0), (2, 1, 2).]
a) Compute the Nash equilibria, resp. the subgame-perfect equilibria, resp. the Bayesian-perfect equilibria, resp. the sequential equilibria, in pure strategies.
b) Compute the sequential equilibria in behavior strategies.
Exercise 6.2.4. Chomp!
Given two positive integers n and m, we define the zero-sum game G(n, m) as follows:
Let P(n, m) be the set of points in the plane $\mathbb{R}^2$ with first coordinate in the set {0, ..., n} and second coordinate in the set {0, ..., m}. At the beginning of the game, a stone is placed at each of these points. Player 1 plays first. He chooses a stone and removes it, together with all stones whose coordinates are both not lower than those of the chosen stone. Then Player 2 plays according to the same rule: he chooses one of the remaining stones and removes it, together with all stones whose coordinates are both not lower than those of the chosen stone. The game continues with players alternating choices. The player removing the last stone loses.
1- Show that in the game G(n, m), player 1 has a winning strategy (it is not asked to exhibit this strategy).
2- Find a winning strategy in the game G(n, n).
3- The game $G(\infty, \infty)$ is similarly defined by first placing a stone at every point in $\mathbb{R}^2$ with non-negative coordinates. What about the value of $G(\infty, \infty)$?
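For small boards, question 1 can be checked by brute force. The sketch below is an illustrative solver, not the argument the exercise asks for: a position is encoded as a tuple of column heights (an encoding chosen here for convenience), and `mover_wins` decides by exhaustive search whether the player about to move can force a win. The names `initial_state` and `mover_wins` are hypothetical.

```python
from functools import lru_cache


def initial_state(n, m):
    """Board of G(n, m): columns 0..n, each holding the stones (x, 0), ..., (x, m)."""
    return tuple([m + 1] * (n + 1))


@lru_cache(maxsize=None)
def mover_wins(heights):
    """True if the player about to move can force a win from this position."""
    if sum(heights) == 0:
        # The opponent has just removed the last stone, hence has lost.
        return True
    for a, height in enumerate(heights):
        for b in range(height):
            # Choosing stone (a, b) removes every stone (x, y) with x >= a and y >= b.
            after = tuple(min(h, b) if x >= a else h for x, h in enumerate(heights))
            if not mover_wins(after):
                return True
    return False


# Player 1 moves first from the full board.
results = {(n, m): mover_wins(initial_state(n, m))
           for n in range(1, 4) for m in range(1, 4)}
```

The search only confirms who wins on the boards it enumerates; it does not replace the general existence argument, and it says nothing about the infinite board of question 3.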
Exercise 6.2.5. Rubinstein's bargaining (finite horizon)
Two players have to split a cake with initial size normalized to 1.
Static game. Player 1 starts by proposing to player 2 a division $(x, 1-x)$ with $x \in [0, 1]$ (with the interpretation that $x$ would be the quantity for player 1, and $1-x$ the quantity for player 2). Then player 2 can accept or refuse, and the game is over. The payoffs are $(x, 1-x)$ if player 2 accepts and $(0, 0)$ otherwise. Show that for each $x$, there exists a NE with payoff $(x, 1-x)$. Show that there exists a unique SPE, and that this equilibrium has payoff $(1, 0)$.
Dynamic game. The players can now alternate offers. After each stage, the size of the cake is multiplied by a fixed discount factor $0 < \delta < 1$. The game $\Gamma_T$ has T stages, with T even, and is defined as follows:
Stage 1: Player 1 proposes $(x_1, 1-x_1)$. Player 2 can accept (then the game is over and the payoffs are $(x_1, 1-x_1)$) or refuse, and the play goes to stage 2.
Stage 2: Player 2 proposes $(x_2, 1-x_2)$. Player 1 can accept (then the game is over and the payoffs are $(\delta x_2, \delta(1-x_2))$) or refuse, and the play goes to stage 3.
Stage t, with t odd: Player 1 proposes $(x_t, 1-x_t)$. Player 2 can accept (then the game is over and the payoffs are $(\delta^{t-1} x_t, \delta^{t-1}(1-x_t))$) or refuse, and the play goes to stage t + 1.
Stage t, with t even: Player 2 proposes $(x_t, 1-x_t)$. Player 1 can accept (then the game is over and the payoffs are $(\delta^{t-1} x_t, \delta^{t-1}(1-x_t))$) or refuse, and the play goes to stage t + 1.
If all the offers are rejected, the payoff after stage T is defined as $(0, 0)$.
The goal is to find the SPE payoffs of $\Gamma_T$ by backwards induction.
1- Consider the auxiliary game $\Gamma_2(x, y)$, with $x \ge 0$, $y \ge 0$, $x + y \le 1$, defined as follows.
Stage 1: Player 1 proposes $(x_1, 1-x_1)$. Player 2 can accept (then the game is over and the payoffs are $(x_1, 1-x_1)$) or refuse, and the play goes to stage 2.
Stage 2: Player 2 proposes $(x_2, 1-x_2)$. Player 1 can accept or refuse, and in both cases the game is over. The payoffs are $(\delta x_2, \delta(1-x_2))$ if he accepts and $(\delta^2 x, \delta^2 y)$ if he refuses.
Find the SPE of $\Gamma_2(x, y)$ by backwards induction, and check that there is a unique SPE payoff, which will be denoted by $F(x, y)$.
2- Show that for each T, there is a unique SPE payoff of $\Gamma_T$, denoted by $U_T \in \mathbb{R}^2$. Prove that for $T \ge 2$, we have $U_T = F(U_{T-2})$. Compute $U_T$ for each T, and give its limit as T goes to infinity.
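The recursion of question 2 can be iterated numerically. The sketch below assumes the closed form $F(x, y) = (1 - \delta + \delta^2 x,\ \delta - \delta^2 x)$, which is one backward-induction reading of $\Gamma_2(x, y)$ and should be checked against your own derivation, together with the starting point $U_0 = (0, 0)$ (all offers rejected). The names `F` and `spe_payoff` are illustrative.

```python
def F(x, y, delta):
    """Assumed SPE payoff of the auxiliary game Gamma_2(x, y).

    Stage 2: player 1 accepts any offer worth at least delta**2 * x to him.
    Stage 1: player 2 accepts any offer worth at least his stage-2 payoff.
    Under this reading, y does not enter the resulting payoff.
    """
    player2 = delta * (1 - delta * x)  # what player 2 secures by refusing
    return (1 - player2, player2)


def spe_payoff(T, delta):
    """Iterate U_T = F(U_{T-2}) from U_0 = (0, 0), for T even."""
    u = (0.0, 0.0)
    for _ in range(T // 2):
        u = F(u[0], u[1], delta)
    return u


delta = 0.9
short_game = spe_payoff(2, delta)   # equals (1 - delta, delta) under the assumption
long_game = spe_payoff(400, delta)  # numerically close to the limit
```

With these assumptions the first coordinate of `long_game` is numerically close to $1/(1+\delta)$, the share that appears in the infinite-horizon version of the exercise.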
Exercise 6.2.6. Rubinstein's bargaining (infinite horizon ***)
As in Exercise 6.2.5, but now there may be infinitely many stages (the players alternate offers, and after each refused offer the size of the cake is multiplied by $\delta$). The definition of subgame-perfect equilibrium extends to this perfect-information game with infinitely many stages.
Show that the following strategy pair is a subgame-perfect equilibrium: 1) the player making an offer proposes to keep a fraction $\frac{1}{1+\delta}$ of the current pie (and consequently, to give $\frac{\delta}{1+\delta}$ of the current pie to the other player); 2) the other player accepts this offer and any better offer, and refuses anything strictly lower.
Show that this is the unique subgame-perfect equilibrium of the game.
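Exercise 6.2.7. Centipede Game. Compute the SPE.
[Figure: centipede game. The players move alternately at eight successive nodes; at node k the mover either stops (d_k), which ends the game, or continues (r_k).]
Node               1       2       3       4       5       6       7       8
Player             P1      P2      P1      P2      P1      P2      P1      P2
Continue           r1      r2      r3      r4      r5      r6      r7      r8
Stop               d1      d2      d3      d4      d5      d6      d7      d8
Payoff if stop     (1, 0)  (0, 2)  (3, 1)  (2, 4)  (5, 3)  (4, 6)  (7, 5)  (6, 8)
Payoff after r8: (7, 7)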
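Backward induction on such a game can be mechanized. The sketch below assumes the reading of the figure in which P1 moves at odd nodes, P2 at even nodes, d_k ends the game with the k-th listed payoff, and (7, 7) is reached after r8. The function name `backward_induction` is illustrative.

```python
STOP_PAYOFFS = [(1, 0), (0, 2), (3, 1), (2, 4), (5, 3), (4, 6), (7, 5), (6, 8)]
FINAL_PAYOFF = (7, 7)


def backward_induction(stop_payoffs, final_payoff):
    """Return the action chosen at each node and the resulting payoff."""
    continuation = final_payoff  # payoff if every remaining mover continues optimally
    plan = []
    for node in range(len(stop_payoffs) - 1, -1, -1):
        mover = node % 2  # index 0 is P1 (odd nodes), index 1 is P2 (even nodes)
        stop = stop_payoffs[node]
        if stop[mover] > continuation[mover]:
            plan.append("d%d" % (node + 1))
            continuation = stop
        else:
            plan.append("r%d" % (node + 1))
    plan.reverse()
    return plan, continuation


plan, payoff = backward_induction(STOP_PAYOFFS, FINAL_PAYOFF)
```

No mover is ever indifferent in this instance, so the strict comparison does not hide any tie-breaking choice.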
Games with imperfect information
Exercise 6.2.8. A simple Poker
Two players play the following zero-sum game. The players first bet 1 euro, i.e. each puts 1 euro at the center of the table.
Then a deck of 52 cards is shuffled uniformly, one card is randomly selected, and only player 1 observes the card.
Player 1 can either quit the game (in this case the game is over and player 1 has lost 1 euro), or add 1 extra euro at the center of the table.
If player 1 did not quit, player 2 has the choice between quitting the game (in this case the game is over and player 2 has lost 1 euro), or adding 1 extra euro himself at the center of the table. In this last case, the selected card is shown: if it is red, Player 1 wins the money put on the table; otherwise the card is black and Player 2 wins the money put on the table.
Write this interaction as an extensive-form game, and write an associated strategic-form game. What is the fair price that player 1 has to pay to player 2 for playing this game?
What are the optimal strategies for the players? How often should player 1 bluff?
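The strategic form can be built mechanically and a candidate solution verified with exact arithmetic. The sketch below assumes a red card has probability 1/2, and it checks one candidate pair of mixed strategies (an assumption to be compared with your own computation): player 1 always bets on red and bets on black with probability 1/3, and player 2 adds his euro with probability 2/3. The labels "bet", "quit" and "call" are illustrative.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def payoff_to_p1(p1_action, p2_action, card):
    """Net gain of player 1 in euros."""
    if p1_action == "quit":
        return -1
    if p2_action == "quit":
        return 1
    return 2 if card == "red" else -2


# A pure strategy of player 1 is (action if red, action if black).
ROWS = list(product(["bet", "quit"], repeat=2))
COLS = ["call", "quit"]
A = {(r, c): HALF * payoff_to_p1(r[0], c, "red") + HALF * payoff_to_p1(r[1], c, "black")
     for r in ROWS for c in COLS}

# Candidate mixed strategies (assumptions to be verified).
x = {("bet", "bet"): Fraction(1, 3), ("bet", "quit"): Fraction(2, 3)}
y = {"call": Fraction(2, 3), "quit": Fraction(1, 3)}

# What x guarantees to player 1, and the most player 1 can get against y.
guaranteed_by_p1 = min(sum(p * A[(r, c)] for r, p in x.items()) for c in COLS)
conceded_by_p2 = max(sum(q * A[(r, c)] for c, q in y.items()) for r in ROWS)
```

When the two bounds coincide, their common value is the value of the zero-sum game and the candidate strategies are optimal.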
Exercise 6.2.9. Strategic information transmission: an example
There are 2 players. A state of nature $k \in \{1, 2\}$ is uniformly selected, and announced to player 1 only. Then player 1 sends a message $m \in \{A, B\}$ to player 2. Finally player 2 chooses an action $s \in \{L, M, R\}$. The payoffs only depend on k and s, and are defined as follows:
                 L         M         R
State k = 1    (0,6)     (2,5)     (0,0)
State k = 2    (0,0)     (2,5)     (2,12)
1- Write the extensive form and specify the pure strategy sets.
2- Compute the Nash equilibria in pure strategies (distinguishing whether Player 1 always sends the same message regardless of Nature's choice, or sends A in one state and B in the other). Compute the equilibrium payoffs. What is the best equilibrium for player 1?
3- Show that the following strategy profile is a Nash equilibrium. Player 1: send A if k = 1, and (A with probability 1/2; B with probability 1/2) if k = 2. Player 2: play M after A and R after B.
Compute the payoff of this equilibrium. How should player 1 reveal the information (fully, not at all, partially)?
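The equilibrium conditions of question 3 can be checked with exact arithmetic. The sketch below encodes the payoff table and the proposed profile, then tests that player 2's reply is optimal after each message (both messages have positive probability) and that player 1 only sends messages that are optimal in each state. The dictionary names are illustrative.

```python
from fractions import Fraction

# PAYOFF[state][action] = (payoff of player 1, payoff of player 2)
PAYOFF = {
    1: {"L": (0, 6), "M": (2, 5), "R": (0, 0)},
    2: {"L": (0, 0), "M": (2, 5), "R": (2, 12)},
}
PRIOR = {1: Fraction(1, 2), 2: Fraction(1, 2)}

# Proposed profile: probabilities of each message per state, and player 2's reply.
SEND = {1: {"A": Fraction(1), "B": Fraction(0)},
        2: {"A": Fraction(1, 2), "B": Fraction(1, 2)}}
REPLY = {"A": "M", "B": "R"}


def receiver_value(message, action):
    """Expected payoff of player 2 (not normalised by the message probability)."""
    return sum(PRIOR[k] * SEND[k][message] * PAYOFF[k][action][1] for k in PRIOR)


receiver_ok = all(receiver_value(m, REPLY[m]) >= receiver_value(m, s)
                  for m in REPLY for s in ("L", "M", "R"))

sender_ok = all(PAYOFF[k][REPLY[m]][0] == max(PAYOFF[k][REPLY[m2]][0] for m2 in REPLY)
                for k in PRIOR for m in REPLY if SEND[k][m] > 0)

equilibrium_payoff = tuple(
    sum(PRIOR[k] * SEND[k][m] * PAYOFF[k][REPLY[m]][i] for k in PRIOR for m in REPLY)
    for i in (0, 1))
```

The same two checks can be reused for the pure-strategy profiles of question 2 by changing `SEND` and `REPLY`, with extra care for messages that are sent with probability zero.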
Exercise 6.2.10. Negative value of information: an example
A) Consider the following 2-player game: a mediator tosses a coin, which shows Heads or Tails with equal probabilities. No player observes the coin. Then player 1 has to announce publicly Heads or Tails. Finally player 2 also has to announce Heads or Tails, and the game is over. The coin is shown, and the payoffs are computed as follows: a player with a bad guess (choosing Heads when the coin is Tails, or vice-versa) has a payoff of 0; a player with a good guess has a payoff of 2 if the other player also made a good guess, and a payoff of 6 otherwise.
Write this interaction as an extensive-form game, and write an associated strategic-form game. What is the set of Nash equilibrium payoffs?
B) The rules of the game are changed: the coin is shown to player 1, but not to player 2, before the players make their announcements. Interpretation?
Exercise 6.2.11.
[Figure: two-player game tree. Node labels: P1, P2, root r, nodes x1, x2. Action labels: L, M, R; a, b. Terminal payoffs, in reading order: (0, 0), (2, 1), (0, 0), (2, 1), (1, 10).]
a) Compute the Nash equilibria, the SPE and the BPE in pure strategies.
b) Same question in behavior strategies.
Exercise 6.2.12.
[Figure: three-player game tree. Node labels: P1, P2, P3, root r, nodes x1, x2, x3. Action labels: E, D; a, b; L, R. Terminal payoffs, in reading order: (2, 2, 2), (0, 1, 0), (2, 0, 0), (2, 0, 0), (1, 0, 0).]
a) Compute the Nash equilibria, the SPE and the BPE in pure strategies.
b) Same question in behavior strategies.
Exercise 6.2.13. Compute the Nash equilibria, the SPE, the BPE and the
sequential equilibria in pure strategies.
[Figure: three-player game tree. Node labels: P1, P2, P3, root r, nodes x1 to x6. Action labels: E, W, D; a, b; L, R. Terminal payoffs, in reading order: (1, 0, 0), (2, 2, 2), (0, 1, 0), (2, 0, 0), (2, 0, 0), (2, 2, 2), (0, 1, 0), (2, 0, 0), (2, 0, 0).]
Exercise 6.2.14. Let $\Gamma$ be the following extensive-form game.
[Figure: two-player game tree. Node labels: P1, P2, root r, nodes x1, x2. Action labels: L, M, R; a, b, c. Terminal payoffs, in reading order: (2, 3), (1, 0), (1, 0), (2, 3), (1, 0), (0, 2), (0, 2).]
a) Compute the Nash equilibria of $\Gamma$ in pure strategies.
b) Compute the SPE, the BPE and the sequential equilibria of $\Gamma$ in pure strategies.
c) Compute the Nash equilibria of $\Gamma$ in mixed strategies.
d) Compute the SPE, the BPE and the sequential equilibria of $\Gamma$ in behavior strategies.
Exercise 6.2.15. An interesting guarantee.
A) A seller (player 1) negotiates the sale of a car to a buyer (player 2). The car may either have high quality (type H) or low quality (type L). The seller is a professional and knows the type of the car. The buyer thinks that both types are equally likely.
The valuation of the seller for a high quality car is $v_1^H = 16$ (e.g., thousands of euros), and for a low quality car it is $v_1^L = 6$. For the buyer, the valuations are $v_2^H = 24$ and $v_2^L = 14$.
The seller has to propose a price p, which can only be either 10 or 20. Then the buyer can either accept the transaction at this price (the utility of the seller is then p minus his valuation for the car, whereas the utility of the buyer is his valuation minus p), or refuse the transaction (in this case, all utilities are 0).
A.1) Write this interaction as an extensive-form game.
A.2) Write the associated strategic form, and give (without justification) the pure Nash equilibria.
B) The game of A) is modified as follows. The seller has the possibility to include a guarantee in his proposal. He now has 4 possible offers: $20_G$ (price of 20 with guarantee), $10_G$ (price of 10 with guarantee), 20 (price of 20 without guarantee) and 10 (price of 10 without guarantee). Then the buyer can accept or refuse the transaction.
When a transaction with guarantee is accepted, the buyer has to pay an extra amount of 2 to the seller, but if the car has a mechanical problem during the following year, the seller will have to give the buyer an amount of 13. To simplify, we assume that a high quality car will not have a problem, whereas a low quality car will have one for sure.
Overall, the situation is now represented by the extensive-form game $\Gamma$ shown on the next page (it is not asked to write the strategic form associated to $\Gamma$).
Denote by $\sigma_1 = (20_G^H, 10^L)$ the pure strategy of the seller who proposes $20_G$ if the car has high quality and 10 if it has low quality. Denote by $\sigma_2 = (R_{20}, A_{10}, A_{20_G}, A_{10_G})$ the pure strategy of the buyer who refuses the proposal of 20 (without guarantee) but accepts all other offers.
B.1) Show that $(\sigma_1, \sigma_2)$ is a Nash equilibrium of $\Gamma$.
B.2) Is $(\sigma_1, \sigma_2)$ a subgame-perfect equilibrium of $\Gamma$? A Bayesian-perfect equilibrium of $\Gamma$?
B.3) Compute all the Bayesian-perfect equilibria, and all the sequential equilibria of $\Gamma$.
Exercise 6.2.16. A jury.
There is a jury with an odd number of members (the players), and a suspect to be declared innocent or guilty. The set of players is written $N = \{1, ..., 2n+1\}$, with n a non-negative integer. There is a true state of nature $k \in \{0, 1\}$ representing whether the suspect is innocent (k = 0) or guilty (k = 1), both being a priori equally likely: $P(k = 0) = P(k = 1) = 1/2$.
Conditional on the true state k, each member i receives an independent signal $k_i \in \{0, 1\}$ such that:
$$P(k_i = 0 \mid k = 0) = 1, \quad P(k_i = 1 \mid k = 1) = 1/2, \quad P(k_i = 0 \mid k = 1) = 1/2.$$
The event $k_i = 1$ may be interpreted as "player i has found evidence that the suspect is guilty". If the suspect is innocent, player i cannot find such evidence, so $k_i = 0$ if $k = 0$. And if the suspect is guilty, player i finds such evidence with probability 1/2. By independence we have
$$P(k_1 = x_1, k_2 = x_2, ..., k_{2n+1} = x_{2n+1} \mid k = 1) = (1/2)^{2n+1}$$
for all values of $x_1, ..., x_{2n+1}$ in {0, 1}. The signals are assumed to be private, i.e. the signal of a player is not observed by the other players.
After receiving their signals, the players simultaneously vote and the majority rule is applied: each player i chooses $a_i \in \{0, 1\}$, and the suspect is declared guilty if $\sum_{i \in N} a_i \ge n + 1$, and innocent otherwise. Then all players receive the same payoff, given by: +1 if the suspect is innocent and declared innocent, +1 if the suspect is guilty and declared guilty, -1 if the suspect is guilty and declared innocent, and $-L$ if the suspect is innocent and declared guilty, where $L \ge 1$ is an exogenous parameter.
This strategic interaction can be represented as a finite extensive-form game $\Gamma$, where nature first chooses $(k, k_1, ..., k_{2n+1})$ and privately informs each player i of $k_i$. It is not asked to represent this extensive-form game. In $\Gamma$, a pure strategy of player i is a mapping $\sigma_i : \{0, 1\} \to \{0, 1\}$ associating to his signal $k_i$ his vote $\sigma_i(k_i)$. By definition, the sincere voting strategy of player i is the identity $\hat{\sigma}_i$ such that $\hat{\sigma}_i(0) = 0$ and $\hat{\sigma}_i(1) = 1$.
I.a) How many information sets does each player have? Show that in $\Gamma$, each Nash equilibrium in behavior strategies is also a sequential equilibrium.
I.b) We assume here that there is a unique jury member: n = 0. Compute $P(k = 0 \mid k_1 = 0)$, and prove that there is a unique Nash equilibrium in behavior strategies. At equilibrium, what is the probability for an innocent suspect to be declared innocent? At equilibrium, what is the probability for a guilty suspect to be declared guilty?
I.c) We assume here that there are at least 3 jury members. Prove that there exists a pure Nash equilibrium where a guilty suspect is always declared innocent. Prove that there exists a pure Nash equilibrium where an innocent suspect is always declared guilty.
In the following questions I.d) and I.e), we assume that there are exactly 3 jury members: n = 1.
I.d) We wonder whether sincere voting for everyone, i.e. the strategy profile $(\hat{\sigma}_1, \hat{\sigma}_2, \hat{\sigma}_3)$, is a Nash equilibrium of $\Gamma$.
I.d.1) Compute the expression:
$$\frac{P(k = 0)\, P(k_1 = 0, k_2 = 0, k_3 = 0 \mid k = 0)}{P(k_1 = 0)},$$
and deduce the value of the probability $P(k = 0, k_2 = 0, k_3 = 0 \mid k_1 = 0)$.
I.d.2) Is sincere voting for everyone a Nash equilibrium?
I.e) Prove that there exists a pure Nash equilibrium where an innocent suspect is declared innocent with probability 1, and a guilty suspect is condemned with probability 3/4.
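The conditional probabilities used in I.b) and I.d) follow from Bayes' rule and can be computed exactly. The sketch below encodes the signal structure stated in the exercise; the function names `signal_prob` and `posterior_innocent` are illustrative.

```python
from fractions import Fraction

PRIOR = {0: Fraction(1, 2), 1: Fraction(1, 2)}  # k = 0 innocent, k = 1 guilty


def signal_prob(signal, state):
    """P(k_i = signal | k = state) as specified in the exercise."""
    if state == 0:
        return Fraction(1) if signal == 0 else Fraction(0)
    return Fraction(1, 2)


def posterior_innocent(signals):
    """P(k = 0 | observed signals), the signals being conditionally independent."""
    def joint(state):
        probability = PRIOR[state]
        for s in signals:
            probability *= signal_prob(s, state)
        return probability

    return joint(0) / (joint(0) + joint(1))


one_clean_signal = posterior_innocent([0])
three_clean_signals = posterior_innocent([0, 0, 0])
one_guilty_signal = posterior_innocent([1])
```

A single signal equal to 1 reveals guilt for sure, whereas signals equal to 0 only shift the belief towards innocence; this asymmetry is what the equilibrium analysis has to exploit.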
Exercise 6.2.17. Consider the strategic game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ represented below:
         l          m          r
T      (1, 0)     (0, 1)     (0, 0)
M      (0, 1)     (0, 0)     (1, 0)
B      (0, 0)     (1, 0)     (0, 1)
1) Compute the mixed Nash equilibria of G.
2) Assume that G is played twice, and that after stage 1 the sum of the stage-1 payoffs is publicly announced, but the action profile of stage 1 is not. More precisely, we define a game $\Gamma$ with perfect recall played as follows:
- at stage 1, each player i chooses an action in his own action set. The choices are simultaneous. Denote by $a_1 = (a_1^1, a_1^2)$ the joint action profile played at stage 1. At the end of stage 1, $a_1$ is not announced to the players, but the sum $s_1 = g^1(a_1) + g^2(a_1)$ is publicly announced. In some cases, the players cannot deduce from their own action and $s_1$ the payoff they got at stage 1.
- at stage 2, again each player i simultaneously chooses an action in his own action set. Denote by $a_2 = (a_2^1, a_2^2)$ the action profile played at stage 2. Then the game is over, and each player i receives the total payoff $g^i(a_1) + g^i(a_2)$. This ends the description of $\Gamma$.
Show that (7/9, 7/9) is a mixed Nash equilibrium payoff of $\Gamma$.
6.3 Bayesian games and games with incomplete information
Exercise 6.3.1. First-price auctions with sealed bids.
There is a single good, and 2 potential buyers. The procedure is the following. Each buyer bids by submitting a written offer in a sealed envelope. All the offers are then transmitted to an auctioneer. The buyer with the highest bid receives the good (in case of a tie, the winner is selected with probability (1/2, 1/2)), and pays the price he offered.
Each buyer $i \in \{1, 2\}$ has a valuation $v_i$ in [0, 1], the maximal amount he is willing to pay for the good. If buyer i eventually buys the good at price p, his utility is $v_i - p$. If he does not buy the good, his utility is 0. Each buyer knows his valuation, and has a uniform belief over the interval [0, 1] regarding the valuation of the other player.
1. Write this interaction as a Bayesian game.
2. Show that there exists a unique pure strategy equilibrium $(\sigma_1, \sigma_2)$ which is symmetric ($\sigma_1 = \sigma_2 = f$), with f increasing and differentiable.
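A candidate for f can be tested numerically before attempting the proof. The sketch below assumes the candidate $f(v) = v/2$ (an assumption to be confirmed by your own derivation): if the opponent bids half of a valuation drawn uniformly on [0, 1], his bid is uniform on [0, 1/2], and a grid search gives the best reply of a buyer with a given valuation. The names `expected_utility` and `best_bid_on_grid` are illustrative.

```python
def expected_utility(valuation, bid):
    """Expected utility of `bid` when the opponent bids half of a uniform valuation."""
    # The opponent's bid is uniform on [0, 1/2], so P(opponent bid < bid) = min(2 * bid, 1).
    win_probability = min(2.0 * bid, 1.0)
    return (valuation - bid) * win_probability


def best_bid_on_grid(valuation, steps=1000):
    """Bid in {0, 1/steps, ..., 1} maximising the expected utility."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda b: expected_utility(valuation, b))


best_replies = {v: best_bid_on_grid(v) for v in (0.2, 0.5, 0.8, 1.0)}
```

If the candidate is an equilibrium strategy, each best reply on the grid should be close to half of the valuation, up to the grid step.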
Exercise 6.3.2. Double auction
A seller (player 1) and a buyer (player 2) negotiate the sale of an indivisible good. The cost for the seller is c, and the valuation for the buyer is v. The values v and c are independent, and both drawn from the uniform distribution on [0, 1]. The seller and the buyer simultaneously select offers $b_1$ and $b_2$. If $b_1 > b_2$, there is no sale. Otherwise, the good is sold, and the price is set to be the average $(b_1 + b_2)/2$.
1- Assume there is complete information, i.e. that v and c are both known by the players, and that v > c. Show that there exists a continuum of equilibria in pure strategies.
2- Assume that there is incomplete information: each player only knows his own valuation. We look for a pure strategy equilibrium $(s_1(.), s_2(.))$. Show that $s_1$ and $s_2$ are non-decreasing. Find a pure equilibrium where $s_1$ and $s_2$ are increasing affine functions. When is the good sold?
6.4 Correlated equilibrium
Exercise 6.4.1. We consider the following 2-player game.
         L          R
T      (4, 1)     (0, 0)
B      (3, 3)     (1, 4)
1) Compute the pure and mixed Nash equilibria.
2) Write the equations characterizing the correlated equilibrium distributions. Find the correlated equilibrium distribution maximizing the sum of the payoffs of the players.
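The inequalities characterizing correlated equilibrium distributions can be checked mechanically for any candidate distribution. The sketch below encodes the game of Exercise 6.4.1 and tests, for each player and each recommended action, that no deviation yields a positive expected gain. The helper names `is_correlated_equilibrium` and `point_mass` are illustrative, and the sketch does not solve the maximization problem of question 2.

```python
from fractions import Fraction

ROWS, COLS = ("T", "B"), ("L", "R")
G = {("T", "L"): (4, 1), ("T", "R"): (0, 0),
     ("B", "L"): (3, 3), ("B", "R"): (1, 4)}


def is_correlated_equilibrium(dist):
    """dist maps each action profile (row, column) to its probability."""
    # Player 1: recommended r, deviation r_dev.
    for r in ROWS:
        for r_dev in ROWS:
            gain = sum(dist[(r, c)] * (G[(r_dev, c)][0] - G[(r, c)][0]) for c in COLS)
            if gain > 0:
                return False
    # Player 2: recommended c, deviation c_dev.
    for c in COLS:
        for c_dev in COLS:
            gain = sum(dist[(r, c)] * (G[(r, c_dev)][1] - G[(r, c)][1]) for r in ROWS)
            if gain > 0:
                return False
    return True


def point_mass(profile):
    """Distribution putting probability 1 on a single action profile."""
    return {p: Fraction(int(p == profile)) for p in G}


# A public coin flip between the profiles (T, L) and (B, R).
coin_flip = {p: (point_mass(("T", "L"))[p] + point_mass(("B", "R"))[p]) / 2 for p in G}
```

Since the constraints are linear in the distribution, the set of correlated equilibrium distributions is convex, which is why mixing two distributions that satisfy the constraints, as in `coin_flip`, satisfies them again.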
Exercise 6.4.2.
         L          R
T      (2, 0)     (0, 1)
B      (0, 1)     (1, 0)
Compute the mixed Nash equilibria and the correlated equilibrium distributions, as well as the associated payoffs.
Exercise 6.4.3.
                 W                              E
         L            R                 L            R
T      (1, 1, 1)    (0, 0, 0)         (0, 0, 0)    (0, 0, 0)
B      (0, 0, 0)    (0, 0, 0)         (0, 0, 0)    (1, 1, 1)
Compute the mixed Nash equilibria and the correlated equilibrium distributions, as well as the associated payoffs.
Same questions for the minority game:
                 W                              E
         L            R                 L            R
T      (0, 0, 0)    (0, 1, 0)         (0, 0, 1)    (1, 0, 0)
B      (1, 0, 0)    (0, 0, 1)         (0, 1, 0)    (0, 0, 0)
Exercise 6.4.4. Let $H = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a finite strategic game; we denote by G the mixed extension of H.
1) Fix a player i in N, and let $x^i \in \Delta(A^i)$ be a particular mixed strategy of player i which is strictly dominated in G. Show that $x^i$ is never a best reply of player i, i.e. show that for any mixed action profile of the other players $x^{-i} = (x^j)_{j \ne i}$, with $x^j \in \Delta(A^j)$ for each j, $x^i$ is not a best reply of player i against $x^{-i}$.
2) Consider the particular case of 3 players: N = {1, 2, 3}, with $A^1 = \{T, B\}$, $A^2 = \{L, R\}$ and $A^3 = \{W, M, E\}$, and the payoffs of player 3 given as follows:
              W                   M                   E
         L      R            L      R            L      R
T        1      0            0      0            0.3    0.3
B        0      0            0      1            0.3    0.3
Consider $x^3 = \frac{1}{2} W + \frac{1}{2} M$, which plays W and M with equal probabilities. Is $x^3$ strictly dominated in G? Can you find $x^{-3} = (x^1, x^2)$ in $\Delta(A^1) \times \Delta(A^2)$ such that $x^3$ is a best reply against $x^{-3}$?
3) We come back to the general case. Fix a player i in N, and let $x^i \in \Delta(A^i)$ be a particular mixed strategy of player i. Show that $x^i$ is strictly dominated in G if and only if it is impossible to find a correlated strategy $x^{-i}$ in $\Delta(\prod_{j \ne i} A^j)$ such that $x^i$ is a best response of player i against $x^{-i}$.
Exercise 6.4.5. Peleg (***) We consider the following strategic game with an infinite number of players: $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$, where N is the set of positive integers, for each player i the set of actions is $A^i = \{0, 1\}$, and the payoff is given by:
$$g^i(a^1, ..., a^n, ...) = \begin{cases} -a^i & \text{if } \sum_{j \in N} a^j = +\infty, \\ a^i & \text{if } \sum_{j \in N} a^j < +\infty. \end{cases}$$
Show that there is no Nash equilibrium in pure strategies. Show that there is no Nash equilibrium in mixed strategies (hint: use the Borel-Cantelli lemma from probability theory).
Define the probabilities $\mu_1$ and $\mu_2$ over $A = \prod_{i \in N} A^i$ (endowed with the product $\sigma$-algebra) as follows: $\mu_1$ selects independently, for each i in N, $a^i = 1$ with probability $1/i$ and $a^i = 0$ with probability $1 - 1/i$. $\mu_2$ selects a positive integer j with probability $\frac{1}{j(j+1)}$, and chooses $a^i = 1$ for each $i \le j$ and $a^i = 0$ for each $i > j$. Show that $\mu = \frac{1}{2} \mu_1 + \frac{1}{2} \mu_2$ is a correlated equilibrium distribution. What is the payoff to each player?
6.5 Repeated Games
Exercise 6.5.1. Compute the set of feasible and individually rational payoffs in the following games:
a)
         L          R
T      (1, 1)     (3, 0)
B      (0, 3)     (0, 0)
b)
         L          R
T      (1, 0)     (0, 1)
B      (0, 0)     (1, 1)
c)
         L          R
T      (1, 1)     (1, 1)
B      (1, 1)     (1, 1)
d)
         L          R
T      (2, 1)     (0, 0)
B      (0, 0)     (1, 2)
e)
                 W                              E
         L            R                 L            R
T      (0, 0, 0)    (0, 1, 0)         (0, 0, 1)    (1, 0, 0)
B      (1, 0, 0)    (0, 0, 1)         (0, 1, 0)    (0, 0, 0)
Exercise 6.5.2. Let G be a finite 2-player zero-sum game. What are the sets of equilibrium payoffs of the finitely repeated game $G_T$, of the discounted game $G_\delta$ and of the uniform game $G_\infty$?
Exercise 6.5.3. Let G be a finite 2-player game such that $E_1 = \{(v^1, v^2)\}$, i.e. there is a unique Nash equilibrium payoff of G, where both players receive their independent minmax. Show that $E_T = \{(v^1, v^2)\}$ for each T.
Exercise 6.5.4. Find a finite game G where the vector of independent minmax $v = (v^i)_{i \in N}$ is not feasible.
Exercise 6.5.5. Consider a prisoner's dilemma played offline (or in the dark): the players do not observe the actions of the other player between the stages.
          $C^2$      $D^2$
$C^1$    (3, 3)     (0, 4)
$D^1$    (4, 0)     (1, 1)
Compute the set of equilibrium payoffs $E_\delta$ of the $\delta$-discounted repeated game $G_\delta$ ($\delta \in [0, 1)$).
Exercise 6.5.6. Consider the battle of the sexes played offline (in the dark): the players do not observe the actions of the other player between the stages.
          $C^2$      $D^2$
$C^1$    (2, 1)     (0, 0)
$D^1$    (0, 0)     (1, 2)
Compute the set of equilibrium payoffs $\bigcup_{T \ge 1} E_T$, where $E_T$ is the set of Nash equilibrium payoffs of the T-stage repeated game with average payoffs.
