
Notes on Stochastic Processes¹

Contents

1 Learning Outcomes

2 Introduction to stochastic processes

3 Discrete time discrete state space stochastic processes

3.1 First return probabilities

4 Restricted Random walk models

5 Period of a state

6 Stationary Distribution

7 References

1 Learning Outcomes

At the end of this chapter a student is expected to

1. learn about stochastic processes and their classification by state space and time.

2. derive the transition probability matrix.

3. find the communication classes.

4. find behavior of the states of a Markov chain.

5. find the limiting distribution, if it exists.

2 Introduction to stochastic processes

A process is deterministic if its future is completely determined by its present and past. For example
let us consider the following processes.
¹These lecture notes were prepared by Biman Chakraborty for the course Stochastic Processes, 5-yr Integrated B.Sc.-M.Sc. in Statistics and Informatics (Sem-VIII), 2-yr M.Sc. in Statistics (Sem-II).

1. Consider the difference equation

Fn = Fn−1 + Fn−2 for n > 2 (2.1)

Then, if F1 = F2 = 1, this is the well-known Fibonacci sequence and {Fn} = {1, 1, 2, 3, 5, 8, ...}.

2. Let us consider the growth governed by the following equation.

dX/dt = rX    (2.2)

The solution of the above equation is given by X(t) = X0 e^(rt). If X0 and r are given then we get
a specific size (X(t)) at time t.

For the above two processes the future is completely determined by the present and past. On the other hand, a stochastic process is a random process evolving in time. Informally, this means that even if you have full knowledge of the state of the system and its entire past, you cannot predict its value at future times with certainty. For example, we consider the following process.
Starting from the origin, a man moves one step to the right (left) if he gets a head (tail) after tossing a coin. What will be his position relative to the origin at time t = 10? If we calculate it, we may get any even integer between −10 and +10. This process is also called a random walk on the integers.
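The spread of outcomes is easy to see by simulation. The following Python sketch is not part of the original notes; the function name and the 10,000 repetitions are our own choices. It repeats the 10-step coin-toss walk many times and tabulates the final positions.

```python
import random

def random_walk_position(n_steps=10, p_head=0.5, seed=None):
    """Final position of a +/-1 coin-toss walk started at the origin."""
    rng = random.Random(seed)
    position = 0
    for _ in range(n_steps):
        position += 1 if rng.random() < p_head else -1
    return position

# Repeat the 10-step experiment many times to see the spread of outcomes.
counts = {}
for _ in range(10_000):
    x = random_walk_position(10)
    counts[x] = counts.get(x, 0) + 1
print(dict(sorted(counts.items())))  # only even positions between -10 and 10 occur
```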

Definition A stochastic process, or often random process, is a collection of random variables {Xt , t ∈
T }, representing the evolution of some system of random values over time. This is the probabilistic
counterpart to a deterministic process (or deterministic system). If T is continuous (discrete) then we call Xt a continuous (discrete) time stochastic process.

Instead of describing a process which can only evolve in one way (as in the case, for example,
of solutions of an ordinary differential equation), in a stochastic or random process there is some
indeterminacy: even if the initial condition (or starting point) is known, there are several (often
infinitely many) directions in which the process may evolve.

Definition The set of all possible values of Xt is called the State Space. The elements of this set are called States. The State Space may be discrete as well as continuous. If the State Space is discrete
(continuous) then the stochastic process is called discrete (continuous) state space stochastic process. If
time is discrete (continuous) then the stochastic process is called discrete (continuous) time stochastic
process.

So, according to time and state space, stochastic processes are of the following four types.

1. Discrete time and discrete state space (DTDS) stochastic processes: For these processes both time and the state space are discrete.

Example : Random walk on the set of integers as described above.

2. Discrete time and continuous state space (DTCS) stochastic processes: Here time is discrete but
state space is continuous.

Example: Autoregressive processes are examples of discrete time and continuous state space stochastic processes.

3. Continuous time and discrete state space (CTDS) stochastic processes: Here time is continuous
but state space is discrete.

Example: Population size of any species over time.

4. Continuous time and Continuous state space (CTCS) stochastic processes: For these processes
time as well as state space is continuous.

Example: Brownian motion

3 Discrete time discrete state space stochastic processes:

Definition A discrete time stochastic process Xn is called a discrete time Markov chain, if the fol-
lowing holds for all choices of n ≥ 0 and any set of states i0 , i1 , i2 , ..., in+1 in the state space S:

P [Xn+1 = in+1 |X0 = i0 , ..., Xn = in ] = P [Xn+1 = in+1 |Xn = in ] (3.1)

The above definition says that the probability distribution of the future state depends only on the current state, and not on the history of the process.

Example 3.1. Random Walk on Integers: A random walk is a model used to describe the motion
of an entity, the walker, on some discrete space. In this process, starting from some origin, the walker moves 1 unit to the right with probability p and 1 unit to the left with probability (1 − p). If p = 1/2 the random walk is called a symmetric random walk. Note that here the state space is the set of integers. The random walk on the integers is an example of a Markov chain (Why?).

Example 3.2. Board games played with dice: A game of snakes and ladders or any other game
whose moves are determined entirely by dice is a Markov chain, indeed, an absorbing Markov chain.
This is in contrast to card games such as blackjack, where the cards represent a ’memory’ of the past
moves. To see the difference, consider the probability for a certain event in the game. In the above-
mentioned dice games, the only thing that matters is the current state of the board. The next state of
the board depends on the current state, and the next roll of the dice. It doesn’t depend on how things
got to their current state.

Example 3.3. Does every stochastic process have the Markov property?

No, not every stochastic process has the Markov property. For example, let an urn contain two red balls and one green ball. One ball was drawn yesterday, one ball was drawn today, and the final ball will be drawn tomorrow. All of the draws are “without replacement”.
Suppose you know that today’s ball was red, but you have no information about yesterday’s ball. The chance that tomorrow’s ball will be red is 1/2. That’s because the only two remaining outcomes for this random experiment are “r, r, g” and “g, r, r”.
On the other hand, if you know that both today and yesterday’s balls were red, then you are
guaranteed to get a green ball tomorrow.
This discrepancy shows that the probability distribution for tomorrow’s color depends not only
on the present value, but is also affected by information about the past. This stochastic process of
observed colors doesn’t have the Markov property.

Remark For any random experiment, there can be several related processes some of which have the
Markov property and others that don’t.
For instance, if you change sampling “without replacement” to sampling “with replacement” in
the urn experiment above, the process of observed colors will have the Markov property.

Example 3.4. Let (Xn) be a stochastic process. We can get a related Markov process by considering the historical process defined by

Hn = (X0, X1, ..., Xn).
In this setup, the Markov property is trivially fulfilled since the current state includes all the past
history.

Definition A Markov chain {Xn : n = 0, 1, 2, ...} is said to be homogeneous, or to have stationary transition probabilities, if for all i, j the transition probability Pij(n, n + 1) does not depend on n, i.e. P[Xn+1 = j|Xn = i] = P[X1 = j|X0 = i].

Definition A stochastic matrix (also termed probability matrix, transition matrix, substitution ma-
trix, or Markov matrix) is a matrix used to describe the transitions of a Markov chain. The (i, j)th
element of the one-step transition probability matrix (denoted by P) is p_ij = P[X1 = j|X0 = i]. The (i, j)th element of the n-step transition probability matrix (denoted by P^(n)) is p_ij^(n) = P[Xn = j|X0 = i].

Example 3.5. Suppose a frog can jump between three lily pads, labeled 1, 2, and 3. We suppose that
if the frog is on lily pad number 1, it will jump next to lily pad number 2 with a probability of one.
Similarly, if the frog is on lily pad number 3, it will next jump to lily pad number 2 with a probability
of one. However, when the frog is on lily pad number 2, it will next jump to lily pad 1 with probability 1/4, and to lily pad 3 with probability 3/4. This is a discrete time discrete state space stochastic process with state space S = {1, 2, 3}. The transition probability matrix is given by

P =
[  0    1    0  ]
[ 1/4   0   3/4 ]
[  0    1    0  ]
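As a quick illustration (not from the original notes), a small NumPy sketch encodes this frog matrix, checks that each row sums to 1, and computes the two-step probabilities, anticipating the Chapman-Kolmogorov equation below.

```python
import numpy as np

# Frog chain of Example 3.5 (states 1, 2, 3 stored at indices 0, 1, 2).
P = np.array([[0.0,  1.0, 0.0],
              [0.25, 0.0, 0.75],
              [0.0,  1.0, 0.0]])

assert np.allclose(P.sum(axis=1), 1.0)   # every row of a stochastic matrix sums to 1

P2 = np.linalg.matrix_power(P, 2)        # two-step transition probabilities
print(P2)                                # e.g. P(X2 = 1 | X0 = 1) = P2[0, 0] = 1/4
```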

Definition A doubly stochastic matrix is a square matrix of nonnegative real numbers with each row
and column summing to 1.

Example 3.6. Is there any stochastic process which is non-stationary or time-inhomogeneous?

Yes, there are many stochastic processes which are non-stationary. For example:

1. In a game such as blackjack, a player can gain an advantage by remembering which cards have
already been shown (and hence which cards are no longer in the deck), so the next state (or
hand) of the game is not independent of the past states.

2. Marital status model: A woman may be: never married, married for the first time, divorced,
widowed or remarried. We can draw a state space diagram, but transition probabilities change
with age.

3. Reversionary annuity: While a husband is alive he pays into a scheme which will make regular
payments to his wife after he is dead. The four states: H & W alive, H alive, W alive, neither
alive, and transitions are clear, but again probabilities are age-dependent.

Theorem 3.1. Chapman-Kolmogorov equation:
Let {Xn : n = 0, 1, 2, ...} be a homogeneous Markov chain. For i, j ∈ S and m, n ∈ {1, 2, 3, ...} the following hold.

1. p_ij^(m+n) = Σ_{k∈S} p_ik^(n) p_kj^(m) = Σ_{k∈S} p_ik^(m) p_kj^(n)

2. p_ij^(m+n) ≥ p_ik^(m) p_kj^(n)   for all k ∈ S

Proof. 1.

p_ij^(m+n) = P[X_{m+n} = j | X0 = i]
           = P[X_{m+n} = j, Xn = k for some k ∈ S | X0 = i]
           = Σ_{k∈S} P[X_{m+n} = j, Xn = k | X0 = i]
           = Σ_{k∈S} P[X_{m+n} = j | Xn = k, X0 = i] P[Xn = k | X0 = i]
           = Σ_{k∈S} P[X_{m+n} = j | Xn = k] P[Xn = k | X0 = i]
           = Σ_{k∈S} P[Xm = j | X0 = k] P[Xn = k | X0 = i]
           = Σ_{k∈S} p_ik^(n) p_kj^(m)    (3.2)

Similarly, by conditioning on Xm = k, it can be shown that p_ij^(m+n) = Σ_{k∈S} p_ik^(m) p_kj^(n).

2. Now p_ij^(m+n) = Σ_{k∈S} p_ik^(m) p_kj^(n) =⇒ p_ij^(m+n) ≥ p_ik^(m) p_kj^(n) for all k ∈ S, as each term on the right side is non-negative.

The second part says that, starting from state i, the probability of being in state j at the (m + n)th step is at least the probability of going from i to k in m steps and then from k to j in n steps, for any intermediate state k.
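A quick numerical check of the theorem (an illustrative sketch, not part of the notes) using the frog matrix from Example 3.5: the (m+n)-step matrix equals the product of the m-step and n-step matrices, and any single product in part 2 is a lower bound.

```python
import numpy as np

P = np.array([[0.0,  1.0, 0.0],
              [0.25, 0.0, 0.75],
              [0.0,  1.0, 0.0]])

m, n = 3, 2
Pm = np.linalg.matrix_power(P, m)
Pn = np.linalg.matrix_power(P, n)
Pmn = np.linalg.matrix_power(P, m + n)

assert np.allclose(Pmn, Pm @ Pn)       # part 1: p^(m+n)_ij = sum_k p^(m)_ik p^(n)_kj

i, k, j = 0, 1, 2                      # part 2: one intermediate state gives a lower bound
assert Pmn[i, j] >= Pm[i, k] * Pn[k, j]
```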

Definition • Initial Distribution: The probability distribution of the random variable X0 is called the initial distribution. For example P[X0 = i] = α_i, i ∈ S. We may also represent it as a row vector π^(0) = (P(X0 = 1), P(X0 = 2), ...).

• Time-dependent distribution: defines the probability that Xn takes a value in a particular subset of S at a given time n. Note that we can calculate this distribution using the initial distribution (π^(0)) and the transition probability matrix P. For a state j ∈ S,

P(Xn = j) = Σ_{i∈S} P(Xn = j, X0 = i) = Σ_{i∈S} P(X0 = i) p_ij^(n) = [π^(0) P^(n)]_j = [π^(0) P^n]_j,

the jth entry of the row vector π^(0) P^n (the n-step matrix P^(n) is the nth power of P).

• Stationary distribution: defines the probability that Xt takes a value in a particular subset of S
as t → ∞ (assuming the limit exists)

• Hitting probability: the probability that a given state in S will ever be entered

• First passage time: the instant at which the stochastic process first enters a given state or set of states, starting from a given initial state

For any process, it is typically of interest to know the values of P(X0 = i0, X1 = i1, ..., Xn = in), where n ∈ {0, 1, 2, 3, ...} and i0, i1, ..., in ∈ S. The event {X0 = i0, X1 = i1, ..., Xn = in} is called a path of the Markov chain. We use the multiplication rule of probability to find the probability of a path.
The multiplication rule:

P (E1 E2 E3 ...En ) = P (E1 )P (E2 |E1 )P (E3 |E1 E2 )...P (En |E1 E2 ...En−1 ) (3.3)

Exercise 3.1. Show that for a set of n events such that P (Ei |E1 E2 ...Ei−1 ) = P (Ei |Ei−1 ), we have
the following multiplication rule.

P (E1 E2 E3 ...En ) = P (E1 )P (E2 |E1 )P (E3 |E2 )...P (En |En−1 ) (3.4)

Using this, we have

P (X0 = i0 , X1 = i1 , ...Xn = in ) = P (X0 = i0 )P (X1 = i1 |X0 = i0 )...P (Xn = in |Xn−1 = in−1 ).

Example 3.7. Let us consider a Markov chain with S = {1, 2, 3, 4, ..., n} and transition probability matrix P = (p_ij)_{n×n}. The probability of the path X0 = 1, X1 = 2, X2 = 4, X3 = 1 is given by P(X0 = 1, X1 = 2, X2 = 4, X3 = 1) = P(X0 = 1) p_{1,2} p_{2,4} p_{4,1}.
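As a sketch (the matrix and initial distribution below are made up for illustration, not taken from the notes), the probability of a path is just the product of the initial probability and the one-step transition probabilities along the path.

```python
import numpy as np

# A hypothetical 4-state chain and initial distribution, only to evaluate one path.
P = np.array([[0.10, 0.40, 0.30, 0.20],
              [0.30, 0.20, 0.20, 0.30],
              [0.25, 0.25, 0.25, 0.25],
              [0.50, 0.10, 0.20, 0.20]])
pi0 = np.array([0.4, 0.3, 0.2, 0.1])   # P(X0 = i) for states 1..4

path = [1, 2, 4, 1]                     # states are 1-based as in the notes
prob = pi0[path[0] - 1]
for a, b in zip(path, path[1:]):
    prob *= P[a - 1, b - 1]             # multiply p_{a,b} along the path
print(prob)                             # P(X0=1, X1=2, X2=4, X3=1)
```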

Definition • A state j ∈ S is said to be accessible from state i ∈ S if p_ij^(n) > 0 for some n ≥ 0 (written as i → j).

• Two states i ∈ S and j ∈ S are said to communicate (written as i ↔ j) if i → j and j → i.

Definition The relation ↔ is an equivalence relation (How?). The equivalence classes of a DTMC are called the communication classes or, more simply, the classes of the Markov chain. If every
state in the Markov chain can be reached from every other state, then there is only one communication
class (all the states are in the same class).

Definition A Markov chain is said to be irreducible if there is only one equivalence class, i.e., if all
states communicate with each other.

Example 3.8. Consider a Markov chain with state space {1, 2, 3, 4} and transition probability matrix P, where

P =
[ 1/2  1/2   0    0  ]
[ 1/2  1/2   0    0  ]
[ 1/4  1/4  1/4  1/4 ]
[  0    0    0    1  ]

Find the equivalence classes.
Will all the states be in a single communicating class? In that case every state must communicate with state 1, so we check this below.

• Case 1: Does state 1 communicate with state 2?

Yes.

Note that p_12 > 0, hence 1 → 2. Also p_21 > 0 =⇒ 2 → 1. Therefore 1 ↔ 2.

• Case 2: Does state 1 communicate with state 3?


Since p_31 > 0, 3 → 1. But we cannot find any n ≥ 0 for which p_13^(n) > 0, since from states 1 and 2 the chain can never leave {1, 2}. Hence 1 and 3 do not communicate.

• Case 3: Does state 1 communicate with state 4? No, because we cannot find any n ≥ 0 for which p_14^(n) > 0, nor any m ≥ 0 such that p_41^(m) > 0 (state 4 is absorbing).

Hence all states cannot be in one communicating class. More precisely, only 1 and 2 are in the same communicating class.
Now another question arises: will 3 and 4 be in the same class? No, because 3 is not accessible from 4.
Hence the communication classes are {{1, 2}, {3}, {4}}.
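The communication classes of a finite chain can also be found mechanically from the reachability relation. The sketch below (illustrative, not part of the notes) computes which states are reachable from which by a transitive closure of the "one step or stay" relation and then intersects it with its transpose.

```python
import numpy as np

# Example 3.8: communication classes via boolean reachability.
P = np.array([[0.5,  0.5,  0.0,  0.0],
              [0.5,  0.5,  0.0,  0.0],
              [0.25, 0.25, 0.25, 0.25],
              [0.0,  0.0,  0.0,  1.0]])

n = P.shape[0]
reach = np.eye(n, dtype=bool) | (P > 0)          # i -> j in zero or one step
for _ in range(n):                               # transitive closure
    reach = reach | ((reach.astype(int) @ reach.astype(int)) > 0)

communicate = reach & reach.T                    # i <-> j
classes = {tuple(int(s) + 1 for s in np.flatnonzero(communicate[i])) for i in range(n)}
print(sorted(classes))                           # [(1, 2), (3,), (4,)]
```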

Definition A state i ∈ S is said to be absorbing if pii = 1.

Lemma 3.1. A state i ∈ S is absorbing if and only if p_ii^(n) = 1 for all n ≥ 0.

Proof. Let i ∈ S be absorbing. Then p_ii^(1) = p_ii = 1.
Now p_ii^(0) = P(X0 = i | X0 = i) = 1, and p_ii^(2) = p_ii^(1+1) ≥ p_ii × p_ii = 1 × 1 = 1 (using the second part of the Chapman-Kolmogorov equation), so p_ii^(2) = 1.
Assume that p_ii^(m) = 1 for some m. Then p_ii^(m+1) ≥ p_ii^(m) × p_ii = 1 × 1 = 1, so p_ii^(m+1) = 1.
Hence by mathematical induction p_ii^(n) = 1 for all n ≥ 0.
Conversely, let p_ii^(n) = 1 for all n ≥ 0 and take n = 1. We get p_ii = 1. Hence i ∈ S is absorbing.

Remark If a state i ∈ S is absorbing then it does not communicate with any other state (How?).

3.1 First return probabilities


Definition • Let f_ii^(n) denote the probability that, starting from state i, the first return to state i occurs at the nth step,

f_ii^(n) = P[Xn = i, Xm ≠ i, m = 1, 2, 3, ..., n − 1 | X0 = i],   n ≥ 1.

The probabilities f_ii^(n) are known as first return probabilities. Define f_ii^(0) = 0. Note that f_ii^(1) = p_ii, but in general f_ii^(n) ≠ p_ii^(n). The first return probabilities refer to the first time the chain returns to state i; thus

0 ≤ Σ_{n=1}^{∞} f_ii^(n) ≤ 1.

A transient state is defined in terms of these first return probabilities.



• State i is said to be transient if Σ_{n=1}^{∞} f_ii^(n) < 1. State i is said to be recurrent (persistent) if Σ_{n=1}^{∞} f_ii^(n) = 1.

Definition If state i is recurrent, then the set {f_ii^(n)}_{n≥1} defines the probability distribution of a random variable T_ii, the first return time to state i.

1. The mean of the distribution of T_ii is referred to as the mean recurrence time to state i, denoted µ_ii = E(T_ii). For a recurrent state i,

µ_ii = Σ_{n=1}^{∞} n f_ii^(n).

2. If a recurrent state i satisfies µii < ∞, then it is said to be positive recurrent, and if it satisfies
µii = ∞, then it is said to be null recurrent.

Example 3.9. Let us consider a Markov chain with state space S = {1, 2} and transition probability matrix

P =
[ 1/2  1/2 ]
[ 1/3  2/3 ]

Find the first return probabilities of each state.

f_11^(1) = P(X1 = 1 | X0 = 1) = p_11 = 1/2
f_11^(2) = P(X2 = 1, X1 ≠ 1 | X0 = 1) = P(X2 = 1, X1 = 2 | X0 = 1) = p_12 p_21 = 1/6
f_11^(3) = P(X3 = 1, X2 ≠ 1, X1 ≠ 1 | X0 = 1) = P(X3 = 1, X2 = 2, X1 = 2 | X0 = 1) = p_12 p_22 p_21 = (1/6) × (2/3)
f_11^(4) = P(X4 = 1, X3 ≠ 1, X2 ≠ 1, X1 ≠ 1 | X0 = 1) = P(X4 = 1, X3 = 2, X2 = 2, X1 = 2 | X0 = 1) = p_12 p_22^2 p_21 = (1/6) × (2/3)^2

Hence

f_11 = f_11^(1) + f_11^(2) + f_11^(3) + f_11^(4) + ...
     = 1/2 + (1/6)(1 + 2/3 + (2/3)^2 + ...) = 1/2 + (1/6) × 3 = 1.

Therefore state 1 is recurrent. The mean recurrence time for state 1 is

µ_11 = Σ_{n=1}^{∞} n f_11^(n)
     = 1 × 1/2 + 2 × 1/6 + 3 × (1/6)(2/3) + 4 × (1/6)(2/3)^2 + ...
     = 1/2 + (1/6) Σ_{m=0}^{∞} (m + 2)(2/3)^m
     = 1/2 + (1/6)(6 + 6) = 5/2,

using Σ_{m≥0} m x^m = x/(1 − x)^2 = 6 and Σ_{m≥0} 2 x^m = 2/(1 − x) = 6 for x = 2/3.

Since the mean recurrence time is 5/2 (< ∞), state 1 is positive recurrent.


Similarly show that 2 is recurrent and find the mean recurrent time.
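For this two-state chain the first return probabilities have the closed form f_11^(n) = p_12 p_22^(n−2) p_21 for n ≥ 2, since the only way to avoid state 1 is to linger in state 2. A short numerical check (a sketch, not part of the notes) truncates the series:

```python
import numpy as np

# Example 3.9: truncated series for f_11 and the mean recurrence time mu_11.
P = np.array([[1/2, 1/2],
              [1/3, 2/3]])

N = 200
f = np.zeros(N + 1)
f[1] = P[0, 0]                                       # f_11^(1) = p_11
for n in range(2, N + 1):
    f[n] = P[0, 1] * P[1, 1] ** (n - 2) * P[1, 0]    # leave, stay in state 2, return

print(f.sum())                        # ~ 1.0  -> state 1 is recurrent
print((np.arange(N + 1) * f).sum())   # ~ 2.5  -> mu_11 = 5/2, positive recurrent
```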

Exercise 3.2. 1. Show that an absorbing state is positive recurrent with mean recurrence time 1.

2. Find the communication classes of the following Markov chain with state space S = {1, 2} and transition probability matrix

P =
[ 0.5  0.5 ]
[ 0.3  0.7 ]

Find which states are transient, positive recurrent and null recurrent. Find the mean recurrence time for the positive recurrent states.
Lemma 3.2. 1. p_ij^(n) = Σ_{r=1}^{n} f_ij^(r) p_jj^(n−r)

2. f_ij^(n) = p_ij^(n) − Σ_{r=1}^{n−1} f_ij^(r) p_jj^(n−r)

Proof. 1. Let τ_j denote the first time the chain visits state j, so that P(τ_j = r | X0 = i) = f_ij^(r). Then

p_ij^(n) = P(Xn = j | X0 = i)
         = P(Xn = j, τ_j = r for some r = 1, 2, ..., n | X0 = i)
         = Σ_{r=1}^{n} P(Xn = j, τ_j = r | X0 = i)
         = Σ_{r=1}^{n} P(Xn = j | τ_j = r, X0 = i) P(τ_j = r | X0 = i)
         = Σ_{r=1}^{n} P(Xn = j | Xr = j, X_{r−1} ≠ j, ..., X1 ≠ j, X0 = i) P(τ_j = r | X0 = i)
         = Σ_{r=1}^{n} P(Xn = j | Xr = j) P(τ_j = r | X0 = i)         (since Xn is a Markov chain)
         = Σ_{r=1}^{n} P(X_{n−r} = j | X0 = j) P(τ_j = r | X0 = i)     (since the chain is homogeneous)
         = Σ_{r=1}^{n} f_ij^(r) p_jj^(n−r)

2. Again from the first part we have

p_ij^(n) = Σ_{r=1}^{n−1} f_ij^(r) p_jj^(n−r) + f_ij^(n) p_jj^(0).

Using the fact p_jj^(0) = 1, we can easily prove the second part.

Lemma 3.3. For i, j ∈ S, with f_ij = Σ_{n=1}^{∞} f_ij^(n) denoting the probability of ever reaching j starting from i,

1. Σ_{n=1}^{∞} p_ij^(n) = f_ij Σ_{n=0}^{∞} p_jj^(n) = f_ij (1 + Σ_{n=1}^{∞} p_jj^(n))

2. f_ij = (Σ_{n=1}^{∞} p_ij^(n)) / (1 + Σ_{n=1}^{∞} p_jj^(n))

3. sup_{n≥1} p_ij^(n) ≤ f_ij ≤ Σ_{n=1}^{∞} p_ij^(n)

Proof. 1.

Σ_{n=1}^{∞} p_ij^(n) = Σ_{n=1}^{∞} Σ_{r=1}^{n} f_ij^(r) p_jj^(n−r)
                     = Σ_{r=1}^{∞} Σ_{n=r}^{∞} f_ij^(r) p_jj^(n−r)    (interchanging the order of summation)
                     = Σ_{r=1}^{∞} f_ij^(r) Σ_{n=r}^{∞} p_jj^(n−r)
                     = f_ij Σ_{m=0}^{∞} p_jj^(m)                       (putting m = n − r)
                     = f_ij (1 + Σ_{n=1}^{∞} p_jj^(n))

2. Can be done easily using (1).

3. We have already proved in Lemma 3.2 that

p_ij^(n) = Σ_{r=1}^{n} f_ij^(r) p_jj^(n−r)    (3.5)

=⇒ p_ij^(n) ≤ Σ_{r=1}^{n} f_ij^(r)            (since p_jj^(n−r) ≤ 1)
            ≤ Σ_{r=1}^{∞} f_ij^(r) = f_ij
=⇒ p_ij^(n) ≤ f_ij for all n
=⇒ sup_{n≥1} p_ij^(n) ≤ f_ij

From part 2 we have

f_ij = (Σ_{n=1}^{∞} p_ij^(n)) / (1 + Σ_{n=1}^{∞} p_jj^(n))

=⇒ (Σ_{n=1}^{∞} p_ij^(n)) / f_ij = 1 + Σ_{n=1}^{∞} p_jj^(n) ≥ 1
=⇒ f_ij ≤ Σ_{n=1}^{∞} p_ij^(n)

Lemma 3.4. A state j is accessible from state i (i → j) if and only if f_ij > 0.

Proof. If state j is accessible from state i then there exists n0 ≥ 0 such that p_ij^(n0) > 0. Using the 3rd part of Lemma 3.3 we get 0 < p_ij^(n0) ≤ sup_{n≥1} p_ij^(n) ≤ f_ij ≤ Σ_{n=1}^{∞} p_ij^(n). Hence f_ij > 0.
Conversely, let f_ij > 0. We need to show that j is accessible from state i. Again using the same lemma we get 0 < f_ij ≤ Σ_{n=1}^{∞} p_ij^(n). Since Σ_{n=1}^{∞} p_ij^(n) > 0, there exists some n0 ≥ 1 such that p_ij^(n0) > 0. Hence j is accessible from state i.

Remark Two states i and j communicate (i ↔ j) if and only if fij > 0 and fji > 0.


Theorem 3.2. A state i is recurrent (transient) if and only if Σ_{n=0}^{∞} p_ii^(n) diverges (converges), i.e.,

Σ_{n=0}^{∞} p_ii^(n) = ∞  (respectively < ∞).


Proof. We shall prove that the state i is recurrent if and only if Σ_{n=0}^{∞} p_ii^(n) = ∞. The rest follows by taking negations.
Let us assume that i is recurrent, i.e. f_ii = 1. Putting j = i in Lemma 3.3 we get Σ_{n=1}^{∞} p_ii^(n) = f_ii (1 + Σ_{n=1}^{∞} p_ii^(n)).
We shall prove this by contradiction. If possible, let Σ_{n=0}^{∞} p_ii^(n) < ∞. Since i is recurrent, we then have Σ_{n=1}^{∞} p_ii^(n) = 1 + Σ_{n=1}^{∞} p_ii^(n) =⇒ 0 = 1, which is a contradiction.
Hence Σ_{n=0}^{∞} p_ii^(n) < ∞ is not possible. However, 0 ≤ Σ_{n=0}^{∞} p_ii^(n) ≤ ∞, which implies Σ_{n=0}^{∞} p_ii^(n) = ∞.
Conversely, let us assume that Σ_{n=0}^{∞} p_ii^(n) = ∞. We shall show that i is recurrent, i.e. f_ii = 1.
Let us consider the sequences a_N = Σ_{n=1}^{N} p_ii^(n) and b_N = Σ_{n=1}^{N} f_ii^(n).

a_N = Σ_{n=1}^{N} Σ_{r=1}^{n} f_ii^(r) p_ii^(n−r)
    = Σ_{r=1}^{N} Σ_{n=r}^{N} f_ii^(r) p_ii^(n−r)
    = Σ_{r=1}^{N} f_ii^(r) Σ_{m=0}^{N−r} p_ii^(m)
    ≤ Σ_{r=1}^{N} f_ii^(r) Σ_{m=0}^{N} p_ii^(m)
    ≤ b_N (1 + a_N)
=⇒ a_N ≤ b_N / (1 − b_N)   (when b_N < 1)

Note that lim_{N→∞} b_N = Σ_{n=1}^{∞} f_ii^(n) = f_ii ≤ 1.
If lim b_N = 1 then i is recurrent and the proof is done. So we only need to rule out lim b_N < 1.
Let lim b_N = l with l < 1. Then lim (1 − b_N) = 1 − l, so lim b_N/(1 − b_N) = l/(1 − l) < ∞, which contradicts lim a_N = Σ_{n=1}^{∞} p_ii^(n) = ∞.
Hence lim b_N < 1 is not possible, so f_ii = 1 and i is recurrent.

Theorem 3.3. If i ↔ j then

1. i is recurrent ⇐⇒ j is recurrent.

2. i is transient ⇐⇒ j is transient.

Proof. We shall assume that i is recurrent and prove that j is recurrent when i ↔ j; the statement for transient states then follows by contraposition.
Since i ↔ j, there exist m0, n0 ∈ {0, 1, 2, ...} such that p_ij^(m0) > 0 and p_ji^(n0) > 0. Using the 2nd part of the Chapman-Kolmogorov equation we have

p_jj^(n+m0+n0) ≥ p_ji^(n0) p_ii^(n) p_ij^(m0)

=⇒ Σ_{n=1}^{∞} p_jj^(n+m0+n0) ≥ p_ji^(n0) p_ij^(m0) Σ_{n=1}^{∞} p_ii^(n) = ∞    (since i is recurrent, Σ_{n=1}^{∞} p_ii^(n) = ∞)

=⇒ Σ_{n=1}^{∞} p_jj^(n) ≥ Σ_{n=1}^{∞} p_jj^(n+m0+n0) = ∞

=⇒ j is recurrent

Theorem 3.4. If j ∈ S is transient, then for all i ∈ S, Σ_{n=1}^{∞} p_ij^(n) < ∞ and lim_{n→∞} p_ij^(n) = 0.

Proof. We have Σ_{n=1}^{∞} p_ij^(n) = f_ij (1 + Σ_{n=1}^{∞} p_jj^(n)) < ∞, since j is transient (Theorem 3.2) and f_ij ≤ 1.
Since the series Σ_{n=1}^{∞} p_ij^(n) is convergent, lim_{n→∞} p_ij^(n) = 0.
n=1 n→∞

Remark The probability of being in a transient state, starting from any state, tends to zero as n becomes large.

Theorem 3.5. In a finite Markov chain all states cannot be transient.

Proof. A Markov chain is finite if the state space has finitely many elements. Let the Markov chain have M states, S = {1, 2, 3, ..., M}.
If possible, let all states be transient; then lim_{n→∞} p_ij^(n) = 0 for all j, irrespective of the initial state i (Theorem 3.4).
Again, for any Markov chain,

Σ_{j=1}^{M} p_ij^(n) = 1   for all n.

Taking the limit as n → ∞ and interchanging the limit with the finite sum,

Σ_{j=1}^{M} lim_{n→∞} p_ij^(n) = 1  =⇒  0 = 1,

which is a contradiction. Hence all states of a finite Markov chain cannot be transient.

Theorem 3.6. In a finite and irreducible Markov chain all states are recurrent.

Proof. Since the Markov chain is irreducible, all states are either transient or all states are recurrent.
In the last Theorem (3.5) we have proved that all states can not be transient. Hence all states of a
finite, irreducible Markov chain are recurrent.

Exercise 3.3. Consider a homogeneous Markov chain {Xn : n ≥ 0} with state space S and transition probability matrix P. Determine which states are transient and which are recurrent. S = {1, 2, 3, 4};

P =
[ 0  0  1/2  1/2 ]
[ 1  0   0    0  ]
[ 0  1   0    0  ]
[ 0  1   0    0  ]

Theorem 3.7. i ∈ S is recurrent ⇐⇒ E[number of returns to i | X0 = i] = ∞.

Proof. Let X0 = i and, for n ≥ 1, define the indicator

I_n = 1 if Xn = i, and I_n = 0 otherwise,

so that the number of returns to i is Σ_{n=1}^{∞} I_n. Then

E[number of returns to i | X0 = i] = Σ_{n=1}^{∞} E[I_n | X0 = i]    (3.6)
                                   = Σ_{n=1}^{∞} 1 × P(Xn = i | X0 = i)
                                   = Σ_{n=1}^{∞} p_ii^(n),

which equals ∞ if and only if i is recurrent, by Theorem 3.2.

Example 3.10. The Gambler’s Ruin Problem: Consider a gambler who at each play of the game
has probability p of winning one unit and probability q = 1 - p of losing one unit. Assuming that
successive plays of the game are independent, what is the probability that, starting with i units, the
gambler’s fortune will reach N before reaching 0? If we let Xn denote the player’s fortune at time n,
then the process {Xn , n = 0, 1, 2, ...} is a Markov chain with transition probabilities

p_00 = p_NN = 1,   p_{i,i+1} = p = 1 − p_{i,i−1},   i = 1, 2, ..., N − 1

This Markov chain has three classes, namely, {0}, {1, 2, ..., N − 1}, and {N }; the first and third
class being recurrent and the second transient. Since each transient state is visited only finitely often,
it follows that, after some finite amount of time, the gambler will either attain his goal of N or go
broke.
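A small Monte Carlo sketch (not part of the notes; the particular values i = 3, N = 10, p = 0.45 are our own) estimates the probability of reaching N before 0 and compares it with the classical gambler's-ruin formula.

```python
import random

def prob_reach_N_before_0(i, N, p, n_sims=100_000, seed=0):
    """Monte Carlo estimate of P(fortune reaches N before 0 | X0 = i)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        x = i
        while 0 < x < N:
            x += 1 if rng.random() < p else -1
        wins += (x == N)
    return wins / n_sims

i, N, p = 3, 10, 0.45
q = 1 - p
estimate = prob_reach_N_before_0(i, N, p)
exact = (1 - (q / p) ** i) / (1 - (q / p) ** N)   # classical formula for p != 1/2
print(estimate, exact)
```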

4 Restricted Random walk models:

A restricted random walk is a random walk with at least one boundary.

Example 4.1. 1. {0, 1, 2, . . . , N}: finite, two boundaries, at 0 and N.

2. {0, 1, 2, . . . }: semi-infinite, one boundary, at 0.

Assumptions about movement at the boundaries x = 0 or x = N differ from movement at the other positions.
Absorbing boundary: An absorbing boundary at x = 0 assumes the one step transition
probability is p00 = 1.
Reflecting boundary: A reflecting boundary at x = 0 assumes the transition probabilities are
p11 = 1 − p and p12 = p.
Elastic boundary: An elastic boundary at x = 0 assumes the transitions probabilities are
p12 = p, p11 = sq, p10 = (1 − s)q, p + q = 1, p00 = 1, for 0 < p, s < 1.

Check the cases s = 0 and s = 1.
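The three boundary behaviours are easy to encode as rows of a transition matrix. The sketch below (illustrative, not from the notes) builds a walk on {0, 1, ..., N} with the chosen behaviour at x = 0 and, for simplicity, an absorbing boundary at x = N.

```python
import numpy as np

def random_walk_matrix(N, p, boundary="absorbing", s=0.5):
    """Transition matrix of a simple random walk on {0, 1, ..., N} (sketch).

    boundary selects the behaviour at x = 0: 'absorbing', 'reflecting' or 'elastic';
    the boundary at x = N is kept absorbing for simplicity.
    """
    q = 1 - p
    P = np.zeros((N + 1, N + 1))
    for x in range(1, N):
        P[x, x - 1], P[x, x + 1] = q, p
    P[N, N] = 1.0                              # absorbing right boundary
    P[0, 0] = 1.0                              # state 0 (unreachable when reflecting)
    if boundary == "reflecting":
        P[1, 0], P[1, 1] = 0.0, q              # attempted move to 0 bounces back to 1
    elif boundary == "elastic":
        P[1, 0], P[1, 1] = (1 - s) * q, s * q  # p10 = (1-s)q, p11 = sq, p12 = p
    return P

print(random_walk_matrix(4, 0.5, "elastic", s=0.25))
```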

Example 4.2. 1. Simple random walk on {1, 2, 3, 4} with absorbing boundaries at 1 and 4, which has the following transition probability matrix P.

P =
[ 1  0  0  0 ]
[ q  0  p  0 ]
[ 0  q  0  p ]
[ 0  0  0  1 ]

2. Simple random walk with reflecting boundaries at 0 and 5.

P =
[ 1  0  0  0  0  0 ]
[ 0  q  p  0  0  0 ]
[ 0  q  0  p  0  0 ]
[ 0  0  q  0  p  0 ]
[ 0  0  0  q  p  0 ]
[ 0  0  0  0  0  1 ]

Note that from state 1, the particle tries to jump to 0, but is immediately reflected back to 1 with
probability q.

3. Consider a random walk with reflecting boundary at 5 but elastic boundary at 0.

P =
[    1       0   0  0  0  0 ]
[ (1 − s)q  sq   p  0  0  0 ]
[    0       q   0  p  0  0 ]
[    0       0   q  0  p  0 ]
[    0       0   0  q  p  0 ]
[    0       0   0  0  0  1 ]

4. Cyclic random walk on {1, 2, 3, 4}

P =
[ 0  p  0  q ]
[ q  0  p  0 ]
[ 0  q  0  p ]
[ p  0  q  0 ]

Example 4.3. Physical applications include, e.g., the position of a particle on the x-axis. The particle starts from position z and at each step moves one unit in the positive or negative direction. The position of the particle after n steps corresponds to the gambler's capital after n trials. The process terminates when the particle reaches position 0 or a for the first time. We say that the particle performs a random walk with absorbing barriers at 0 and a. This random walk is restricted to positions 1, 2, . . . , a − 1. In the absence of barriers the random walk is called unrestricted.
An elastic barrier is a generalization uniting both the absorbing and the reflecting barrier. When the particle reaches the position next to a reflecting barrier, say position 1, it has probability p to move to 2 and probability q to stay at 1 (the particle tries to go to 0, but is reflected back).
The elastic barrier at the origin works in the following way. From position 1 the particle

1. moves with probability p to position 2.

2. stays at 1 with probability εq.

3. with probability (1 − ε)q it moves to 0 and is absorbed (process terminates).

For ε = 0 we get the absorbing barrier. For ε = 1 we get the reflecting barrier. In correlated random walk each step is of 1 unit length and all directions are equally likely.

5 Period of a state

The period of a state i, denoted d(i), is the greatest common divisor of all integers n ≥ 1 for which p_ii^(n) > 0. That is,

d(i) = g.c.d.{n ≥ 1 : p_ii^(n) > 0}.

Equivalently, the state i has period d if p_ii^(n) = 0 unless n = νd is a multiple of d.

If a state i has period d(i) > 1, it is said to be periodic with period d(i). If d(i) = 1, state i is said to be aperiodic. If p_ii^(n) = 0 for all n ≥ 1, define d(i) = 0.

Example 5.1. Find the periods of the following Markov chain.

P =
[ 0.5  0.3  0.2 ]
[ 0.2  0.5  0.3 ]
[ 0.1  0.5  0.4 ]

Here p_ii > 0 for all i = 1, 2, 3, so 1 belongs to {n : p_ii^(n) > 0} and all states are aperiodic.

Theorem 5.1. Let Xn be a Markov chain with state space S. If i, j ∈ S are in the same communication
class, then d(i) = d(j). That is they have the same period.

Example 5.2. Find the periods of the following Markov chain with S = {1, 2, 3}.

P =
[ 0    0.5  0.5 ]
[ 0.5  0    0.5 ]
[ 0.5  0.5  0   ]

If we calculate P^2 and P^3 we get

P^2 =
[ 0.5   0.25  0.25 ]
[ 0.25  0.5   0.25 ]
[ 0.25  0.25  0.5  ]

P^3 =
[ 0.25   0.375  0.375 ]
[ 0.375  0.25   0.375 ]
[ 0.375  0.375  0.25  ]

Hence the period of state 1 is g.c.d.(2, 3, ...) = 1. Similarly the period of each state is 1.

Example 5.3. Find the periods of the following Markov chain with S = {1, 2, 3}.

P =
[ 0    1  0   ]
[ 0.5  0  0.5 ]
[ 0    1  0   ]

To find the period, we calculate P^2, P^3, P^4, etc.

P^2 =
[ 0.5  0  0.5 ]
[ 0    1  0   ]
[ 0.5  0  0.5 ]

P^3 =
[ 0    1  0   ]
[ 0.5  0  0.5 ]
[ 0    1  0   ]

P^4 =
[ 0.5  0  0.5 ]
[ 0    1  0   ]
[ 0.5  0  0.5 ]

Hence the period of state 1 is g.c.d.(2, 4, ...) = 2. Similarly the period of each state is 2.
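The period can also be found mechanically: collect the steps n at which p_ii^(n) > 0 and take their greatest common divisor. A small sketch (not part of the notes; the cut-off n_max is arbitrary) applied to Example 5.3:

```python
import numpy as np
from math import gcd
from functools import reduce

def periods(P, n_max=50):
    """Period of each state: gcd of all n <= n_max with p_ii^(n) > 0 (illustrative)."""
    k = P.shape[0]
    return_times = [[] for _ in range(k)]
    Pn = np.eye(k)
    for n in range(1, n_max + 1):
        Pn = Pn @ P
        for i in range(k):
            if Pn[i, i] > 1e-12:
                return_times[i].append(n)
    return [reduce(gcd, times) if times else 0 for times in return_times]

P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
print(periods(P))   # [2, 2, 2] -> every state of Example 5.3 has period 2
```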

Definition An aperiodic positive recurrent state is called ergodic. If the Markov chain is irreducible and all states are ergodic then the Markov chain is called an ergodic Markov chain.

Theorem 5.2. 1. If j is positive recurrent, then

lim_{n→∞} p_jj^(nt) = t / µ_jj,

where t is the period of state j.

2. If j is null recurrent (whether periodic or aperiodic) then

lim_{n→∞} p_jj^(nt) = 0.

3. If k is null recurrent then for any j ∈ S,

lim_{n→∞} p_jk^(n) = 0.

4. If k is aperiodic and positive recurrent, then for any j ∈ S,

lim_{n→∞} p_jk^(n) = f_jk / µ_kk.

Proof. Beyond the scope of this note.

Definition Two states i, j ∈ S are said to be of the same type if they have the same classification.
That is

1. i and j have the same period, and

2. either both i and j are transient, or both i and j are positive recurrent, or both i and j are null recurrent.

Definition A set C of states is said to be closed if, once the process enters it, it cannot get out of it, i.e., if f_ij = 0 for all i ∈ C, j ∈ C^c. If C is closed then p_ij^(n) = 0 for all n ≥ 1 and all i ∈ C, j ∈ C^c.

6 Stationary Distribution

It was noted earlier that one of the goals of the theory of Markov chains is to establish that, under certain hypotheses, the distribution of states tends to a limiting distribution. If indeed this is the case, then there is a row vector π = (π1, π2, ...) with π_j ≥ 0 and Σ_j π_j = 1 such that π^(0) P^n → π, where π^(0) is the initial distribution. Note that π^(0) P^n is again a row vector (P(Xn = 1), P(Xn = 2), ...).

Definition Consider a Markov chain with transition matrix P. A non-negative vector π is said to be
an invariant measure if
π′P = π′ ,

which in component form is


π_i = Σ_j π_j p_ji   for all i ∈ S.

If π also satisfies Σ_k π_k = 1, then π is called a stationary, equilibrium or steady state probability distribution. Thus, a stationary distribution is a left eigenvector of the transition matrix with associated eigenvalue equal to one.
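In practice the stationary distribution can be computed directly as the left eigenvector of P for eigenvalue 1. A small NumPy sketch (illustrative, not from the notes; it uses the matrix of Example 6.1 below):

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalised to sum to 1 (sketch)."""
    eigvals, eigvecs = np.linalg.eig(P.T)   # right eigenvectors of P' = left eigenvectors of P
    k = np.argmin(np.abs(eigvals - 1.0))    # eigenvalue closest to 1
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()

P = np.array([[0.5, 0.5],
              [0.7, 0.3]])
pi = stationary_distribution(P)
print(pi)       # [7/12, 5/12], the stationary distribution of Example 6.1
print(pi @ P)   # equals pi, i.e. pi' P = pi'
```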

Example 6.1. Let us consider the Markov chain with state space S = {1, 2} and transition probability matrix P, where

P =
[ 0.5  0.5 ]
[ 0.7  0.3 ]

The eigenvalues are 1 and −0.2. A left eigenvector of P corresponding to the eigenvalue 1 is a right eigenvector of P′ corresponding to 1. Let (x, y)′ be this eigenvector. Hence

[ −0.5   0.7 ] [ x ]   [ 0 ]
[  0.5  −0.7 ] [ y ] = [ 0 ]

Solving, we get 5x = 7y = k (say). This eigenvector will be a probability distribution if x + y = 1, i.e. k/5 + k/7 = 1, i.e. k = 35/12.
Therefore x = 7/12, y = 5/12.
Note that lim_{n→∞} P(Xn = 1) = 7/12 and lim_{n→∞} P(Xn = 2) = 5/12.
Again, we can calculate

lim_{n→∞} P^(n) =
[ 7/12  5/12 ]
[ 7/12  5/12 ]

Note that all rows are identical, which says that the limiting distribution of Xn does not depend on the initial state, i.e., lim_{n→∞} P(Xn = j | X0 = i) is independent of the initial state i.

Lemma 6.1. For a Markov chain Xn , the distribution of Xn is independent of n ⇐⇒ the initial
distribution is a stationary distribution.
Proof: Let a_j = P[X0 = j], j = 1, 2, ...
First suppose that the distribution of Xn does not depend on n. Then

a_j = P[X1 = j] = Σ_{k=1}^{∞} P[X1 = j | X0 = k] P[X0 = k] = Σ_{k=1}^{∞} a_k p_kj,

so {a1, a2, ...} is a stationary distribution.

Conversely, suppose that {a1, a2, ...} is a stationary distribution. Then a′P = a′ implies a′P^n = a′, so

P[X0 = j] = a_j = Σ_{k=1}^{∞} a_k p_kj^(n) = Σ_{k=1}^{∞} P[X0 = k] P[Xn = j | X0 = k] = P[Xn = j],

i.e. the distribution of Xn is independent of n.

Definition When the distribution of Xn is independent of n, we say that the process is in steady equilibrium.
Our aim is to find under what conditions a Markov chain, regardless of the initial state j, reaches a steady or stable state after a large number of transitions, i.e., under what conditions lim_{n→∞} p_jk^(n) = π_k exists for all j.

Example 6.2. Let us consider a Markov chain with state space S = {1, 2, 3, 4} and transition probability matrix

P =
[  0   1/5  3/5  1/5 ]
[ 1/4  1/4  1/4  1/4 ]
[  1    0    0    0  ]
[  0   1/2  1/2   0  ]

It is your exercise to verify (using R) that

lim_{n→∞} P^(n) =
[ 0.3731  0.1791  0.3284  0.1194 ]
[ 0.3731  0.1791  0.3284  0.1194 ]
[ 0.3731  0.1791  0.3284  0.1194 ]
[ 0.3731  0.1791  0.3284  0.1194 ]

Hence a unique stationary distribution exists, given by π = (0.3731, 0.1791, 0.3284, 0.1194). Verify this by finding the left eigenvector of P, i.e. the right eigenvector of P′, corresponding to eigenvalue 1.
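The notes suggest doing this verification in R; an equivalent check in Python/NumPy (an illustrative sketch) raises P to a high power and also extracts the left eigenvector:

```python
import numpy as np

P = np.array([[0,   1/5, 3/5, 1/5],
              [1/4, 1/4, 1/4, 1/4],
              [1,   0,   0,   0  ],
              [0,   1/2, 1/2, 0  ]])

print(np.round(np.linalg.matrix_power(P, 100), 4))   # every row ~ (0.3731, 0.1791, 0.3284, 0.1194)

eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
print(np.round(pi / pi.sum(), 4))                     # the same stationary distribution
```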

Example 6.3. Let us consider the following Markov chain with S = {1, 2, 3}.

P =
[ 0    1  0   ]
[ 0.5  0  0.5 ]
[ 0    1  0   ]

If we calculate the higher-order transition probability matrices we get

P^(2n+1) =
[ 0    1  0   ]
[ 0.5  0  0.5 ]
[ 0    1  0   ]

and

P^(2n) =
[ 0.5  0  0.5 ]
[ 0    1  0   ]
[ 0.5  0  0.5 ]
Remarks: The following points are important.

• The transition probabilities p_ij^(n) are not independent of n.

• The rows are not identical for all the transition matrices.

• The limiting distribution does not exist!

• What about the period of the Markov chain?

Note that we have already derived that the Markov chain is periodic and the period of each state is
2.

Example 6.4. Let us consider a Markov chain with state space S = {1, 2, 3, 4} and transition probability matrix

P =
[  1    0   0    0  ]
[ 1/2   0  1/2   0  ]
[ 1/3   0   0   2/3 ]
[  0    0   0    1  ]

It is your exercise to verify that

lim_{n→∞} P^(n) =
[   1     0  0    0    ]
[ 0.6667  0  0  0.3333 ]
[ 0.3333  0  0  0.6667 ]
[   0     0  0    1    ]

Hence a unique stationary distribution does not exist.

Hence the following questions arise:

1. Under what conditions on a Markov chain will a stationary distribution exist?

2. When a stationary distribution exists, when is it unique?

3. Under what conditions can we guarantee convergence to a unique stationary distribution?

Theorem 6.1. If a Markov chain is irreducible and recurrent, then there is an invariant measure π, unique up to multiplicative constants, that satisfies 0 < π_j < ∞ for all j ∈ S. Further, if the Markov chain is positive recurrent then the stationary distribution satisfies

π_i = 1 / µ_ii.
Theorem 6.2. Suppose a Markov chain is irreducible and that a stationary distribution π exists:
π′ = π′P,   Σ_{j∈S} π_j = 1,   π_j > 0.

Then, the Markov chain is positive recurrent.


Thus, for an irreducible Markov chain, demonstrating the existence or non-existence of a stationary distribution is a necessary and sufficient way of determining positive recurrence.

Remark If the stationary distribution exists then

π_i = 1/µ_ii = lim_{n→∞} P[Xn = i],   i ∈ S.

Example 6.5. Let us consider the Markov chain with state space S = {1, 2} and transition probability matrix

P =
[ 1/2  1/2 ]
[ 1/3  2/3 ]

We know that the Markov chain is irreducible and finite. Hence all states are positive recurrent, and since the Markov chain is irreducible the Markov chain is positive recurrent. It is also easy to verify that the Markov chain is aperiodic. Hence the stationary distribution exists. In the following we verify the above remark.
Let us find the stationary distribution. The eigenvalues are 1 and 1/6. A left eigenvector of P corresponding to the eigenvalue 1 is a right eigenvector of P′ corresponding to 1. Let (x, y)′ be this eigenvector. Hence

[ −1/2   1/3 ] [ x ]   [ 0 ]
[  1/2  −1/3 ] [ y ] = [ 0 ]

Solving, we get x/2 = y/3 = k (say). This eigenvector will be a probability distribution if x + y = 1, i.e. 2k + 3k = 1, i.e. k = 1/5.
Therefore x = 2/5, y = 3/5.
Note that lim_{n→∞} P(Xn = 1) = 2/5 and lim_{n→∞} P(Xn = 2) = 3/5.
We have already shown (see Example 3.9) that µ_11 = 5/2. Hence lim_{n→∞} P(Xn = 1) = 1/µ_11. The same can be verified for state 2 in a similar way.

Example 6.6. Here we consider a Markov chain with state space S = {1, 2, 3, 4, 5} and transition probability matrix

P =
[ 1/3  2/3   0    0    0  ]
[ 3/4  1/4   0    0    0  ]
[  0    0   1/8  1/4  5/8 ]
[  0    0    0   1/2  1/2 ]
[  0    0   1/3   0   2/3 ]

It is easy to verify that the Markov chain is not irreducible and its communication classes are {{1, 2}, {3, 4, 5}}. We can find that

lim_{n→∞} P^(n) =
[ 0.5294  0.4706  0.0000  0.0000  0.0000 ]
[ 0.5294  0.4706  0.0000  0.0000  0.0000 ]
[ 0.0000  0.0000  0.2424  0.1212  0.6364 ]
[ 0.0000  0.0000  0.2424  0.1212  0.6364 ]
[ 0.0000  0.0000  0.2424  0.1212  0.6364 ]

We have the following observations:

• The first two rows are identical, and the last three rows are identical.

• The limiting transition probabilities are not independent of the initial state.

Let us justify why a single limiting distribution does not exist:

• The Markov chain is not irreducible.

• Is periodicity the problem? No: the Markov chain is aperiodic (Why?), so the failure is due to reducibility.

7 References

1. Anderson, D. F.: Introduction to Stochastic Processes with Application in Biosciences

2. Allen, Linda J. S.: An Introduction to Stochastic Processes with Applications to Biology

3. Ross, S. M.: Stochastic Processes

4. Ross, S. M.: Introduction to Probability Models

5. Karlin, S. and Taylor, H. M.: A First Course in Stochastic Processes

