
Reminder of stochastic processes

Lecturer: Dmitri A. Moltchanov


E-mail: moltchan@cs.tut.fi

http://www.cs.tut.fi/kurssit/ELT-53606/
Network analysis and dimensioning I D.Moltchanov, TUT, 2013
OUTLINE:
• Definition and basic notions;
• Characterization of stochastic processes;
• Classifications of stochastic processes;
• Ergodic processes;
• Markov processes;
• Discrete-time homogeneous Markov chains;
• Continuous-time homogeneous Markov chains;
• Classification of states;
• Ergodic chains and stationary distribution.


1. Definitions and notions


Assume the following:
• (Ω, F, P ) is a probability space;
• there are random variables defined on this space.

A set of random variables indexed by some parameter t, S(t) ∈ E:

S = {S(t), t ∈ T } (1)

• is called a stochastic process, where E is the state space of the process;


• in other words, for each t ∈ T , S(t) is a mapping from Ω to the set E.

Examples of processes:
• packet arrival process to an IP router;
• call arrival process to a telephone exchange, etc.

DEFINING SET T :
• set T is the index set of the stochastic process;
• set T is often (not always!) referred to as time.

If T is countable:

T = ℕ = {0, 1, . . . },
T = ℤ = {. . . , −2, −1, 0, 1, 2, . . . },    (2)

• we are given a discrete-time stochastic process.

If T is not countable:

T = ℝ = (−∞, ∞),
T = (0, ∞),    (3)

• we are given a continuous-time stochastic process.

DEFINING SET E:
• set E is called the state space of the stochastic process {S(t), t ∈ T }.

If E is countable:

E = ℕ = {0, 1, . . . },
E = ℤ = {. . . , −2, −1, 0, 1, 2, . . . },    (4)

• we are given a discrete-space stochastic process.

If E is not countable:

E = (0, ∞),
E = ℝ = (−∞, ∞),    (5)

• we are given a continuous-space stochastic process.


[Figure: observed points, vertical axis "Space E" (values around 2.2·10⁴ to 3·10⁴), horizontal axis "Space T " (t = 10, . . . , 50)]

Figure 1: Observations of a discrete-time, discrete-space stochastic process.


1.1. Sections and trajectories


Do the following:
• observe a certain stochastic process a number of times.
[Figure: several realizations of S(t) plotted against t, with vertical sections marked]

Figure 2: Realizations and sections of a stochastic process.

• fix a certain curve: we get a trajectory (realization) of the stochastic process;
• fix a certain t ∈ T : we get a section of the stochastic process.


2. Characteristics of stochastic processes


Observe the following:
• it is still unclear how to formally characterize a stochastic process;
• recall: any RV can be fully characterized by its CDF (pdf or pmf).

Similarly, we can characterize an arbitrary section:

F (t, s) = P r{S(t) ≤ s}. (6)

• depends on both t and s;


• is called the one-dimensional distribution function of {S(t), t ∈ T }.

F (t, s) does not completely characterize the stochastic process:


• example: what if there is dependence between subsequent sections?

More information: the two-dimensional distribution function:

F (t1 , s1 , t2 , s2 ) = P r{S(t1 ) ≤ s1 , S(t2 ) ≤ s2 }.    (7)

By induction:

F (t1 , s1 , t2 , s2 , . . . ) = P r{S(t1 ) ≤ s1 , S(t2 ) ≤ s2 , . . . }.    (8)

We observe the following:

• a full description is given by the joint distribution of all its sections;
• it is not easy to deal with such a description.
What we usually do in practice:
• we usually do not use general processes:
– example: Markov processes: the two-dimensional distribution is sufficient.
• we may also limit the description of processes to moments:
– example: correlation theory: mean and autocorrelation function.


2.1. Mean
Mean of the stochastic process {S(t), t ∈ T }:
• a non-probabilistic function ms (t);
• for all t ∈ T it equals the mean of the corresponding section:

ms (t) = ∫_{−∞}^{∞} x s(t, x) dx,    (9)

where s(t, x) is the pdf of the section S(t).

[Figure: realizations of S(t) with the mean curve drawn through them]

Figure 3: Mean of the stochastic process {S(t), t ∈ T } (denoted by the black line).


2.2. Variance
Variance of the stochastic process {S(t), t ∈ T }:
• a non-probabilistic function Ds (t);
• for all t ∈ T it equals the variance of the corresponding section:

Ds (t) = ∫_{−∞}^{∞} (x − ms (t))² s(t, x) dx.    (10)

One can similarly define:

• skewness;
• kurtosis;
• higher moments and central moments.
Important:
• these moments characterize sections in isolation!
• we still need a descriptor of dependence.


2.3. Autocorrelation function


Autocorrelation function (ACF):
• a non-probabilistic function Ks (t1 , t2 );
• for all pairs t1 , t2 ∈ T it is just the covariance of the corresponding sections:

Ks (t1 , t2 ) = E[S(t1 )S(t2 )] − ms (t1 )ms (t2 ).    (11)

– recall: this is a measure of linear dependence between sections.
Normalized autocorrelation function (NACF):
• a non-probabilistic function Rs (t1 , t2 );
• for all pairs t1 , t2 ∈ T it is just the correlation coefficient of the corresponding sections:

Rs (t1 , t2 ) = (E[S(t1 )S(t2 )] − ms (t1 )ms (t2 )) / √(Ds (t1 )Ds (t2 )).    (12)

Important notes: NACF and ACF can be used interchangeably:
• since −1 ≤ Rs (t1 , t2 ) ≤ 1, the NACF is often preferable;
• the ACF includes the variance, since Ks (t1 , t1 ) = Ds (t1 )!
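
To make these estimators concrete, here is a minimal sketch in Python with NumPy (function and variable names are ours, not from the lecture) that estimates the section means, variances, and the NACF from an ensemble of observed realizations, replacing expectations by sample averages over realizations:

```python
import numpy as np

def section_statistics(X):
    """Estimate m_s(t), D_s(t) and R_s(t1, t2) from an ensemble X.

    X is an (n_realizations, n_time_points) array: each row is one
    trajectory, each column is one section of the process.
    """
    m = X.mean(axis=0)                 # section means m_s(t)
    D = X.var(axis=0, ddof=1)          # section variances D_s(t)
    Xc = X - m                         # centered sections
    # ACF estimate: K_s(t1, t2) = E[S(t1)S(t2)] - m_s(t1) m_s(t2)
    K = (Xc.T @ Xc) / (X.shape[0] - 1)
    # NACF estimate: R_s(t1, t2) = K_s(t1, t2) / sqrt(D_s(t1) D_s(t2))
    R = K / np.sqrt(np.outer(D, D))
    return m, D, R

# Example: 1000 realizations of a simple dependent process (a random walk)
rng = np.random.default_rng(1)
X = np.cumsum(rng.normal(size=(1000, 50)), axis=1)
m, D, R = section_statistics(X)
print(D[:5])       # variance grows with t: the process is non-stationary
print(R[10, 20])   # strong positive correlation between sections
```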


3. Classifications
3.1. Based on stationarity
General classification:
• stationary processes:
– at least the mean and ACF do not depend on time.
• non-stationary processes:
– the mean or ACF or both depend on time;
– practically, their characteristics may change in time.
Note: stationarity is an advantageous property:
• correlation theory is quite well developed;
• but we are limited to the first two moments (Ks (ti , ti ) = Ds (ti )) and linear dependence!
Notes on non-stationary processes:
• very hard to deal with;
• it seems that most processes observed in networks are somewhat non-stationary!


3.2. Based on the type of stationarity


A stochastic process can be:
• strictly stationary:
– the n-dimensional distribution function does not change with the time shift τ for all n;
– for (t1 , t2 , . . . , tn ) and (t1 + τ, t2 + τ, . . . , tn + τ ) the n-dimensional distribution is the same.
• covariance stationary:
– the mean is constant in time;
– the ACF depends on the time shift only:

Ks (t1 , t2 ) = Ks (τ ),    τ = |t2 − t1 |.    (13)

Note that strictly stationary processes are also called:

• strong-sense stationary processes.
Note that covariance stationary processes are also called:
• weakly stationary processes;
• second-order stationary processes.


3.3. Based on memory


A stochastic process can be:
• with one-step-ahead dependence:
– called a Markov process;
– named after A.A. Markov;
– can be completely characterized by the set of two-dimensional distributions;
– the most important class of processes for many applied fields due to analytical tractability.
• other processes.

3.4. Based on ergodic property


A stochastic process can be:
• ergodic;
• non-ergodic.


4. Ergodic stationary processes


The main property:
• a single trajectory gives all information regarding the characteristics of the process;
• for example: mean, variance, ACF, etc.
Notes on the ergodic property:
• a trajectory must be observed for long enough...
• how long is long enough?...
Mean of a stationary ergodic process is given by:

ms = E[S(t)] = lim_{T→∞} (1/2T) ∫_{−T}^{T} S(t) dt.    (14)

Variance of a stationary ergodic process is given by:

Ds = D[S(t)] = lim_{T→∞} (1/2T) ∫_{−T}^{T} (S(t) − ms )² dt.    (15)

ACF of a stationary ergodic process is given by:

Ks (τ ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} (S(t) − ms )(S(t − τ ) − ms ) dt.    (16)

Sufficient condition of ergodicity is given by:

lim_{τ→∞} Ks (τ ) = 0.    (17)
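
For a sampled trajectory the limits above turn into long-run sample averages. A minimal sketch (Python/NumPy, names hypothetical), assuming the process observed is indeed stationary ergodic:

```python
import numpy as np

def time_averages(x, max_lag=50):
    """Estimate mean, variance and ACF K_s(tau) from one long trajectory x,
    replacing the integrals (14)-(16) by sums over a finite window."""
    m = x.mean()
    d = x.var()
    xc = x - m
    acf = np.array([np.mean(xc[lag:] * xc[:len(x) - lag])
                    for lag in range(max_lag + 1)])
    return m, d, acf

# Ergodic example: AR(1) process x[n] = a*x[n-1] + noise, |a| < 1
rng = np.random.default_rng(2)
a, n = 0.8, 200_000
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = a * x[i - 1] + rng.normal()
m, d, acf = time_averages(x)
print(m)                  # close to the true mean, 0
print(acf[10] / acf[0])   # close to a**10: K_s(tau) -> 0, so condition (17) holds
```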

Summarizing the ergodic property:

• it allows one to easily compute characteristics using a single realization of the process;
• one has to be sure that the process observed is ergodic. Tough to do!

Practically, a process is not stationary ergodic:

• when some type of heterogeneity is found...
• e.g., when there is a deterministic influence at some instants of time, etc.


5. Markov processes
Historical facts:
• A.A. Markov was a Russian mathematician (1856–1922):
– first results in 1906 (discrete-space Markov processes);
– extension of the law of large numbers to dependent events;
– contributors: Kolmogorov, Khinchine, Fokker, Planck, Uhlenbeck, Wiener, Feller, . . .
• note: classic queuing theory is mostly based on Markovian processes!
Assume the following:
• we are given a discrete-time, discrete-space process {S(t), t ∈ T };
• the stochastic process {S(t), t ∈ T } is called a Markov process if it satisfies:

P r{S(tn+1 ) = sn+1 |S(tn ) = sn , S(tn−1 ) = sn−1 , . . . } = P r{S(tn+1 ) = sn+1 |S(tn ) = sn }.    (18)

– called the Markov property;
– also called the memoryless property, one-step memory, memoryless process, etc.


5.1. Description and time-homogeneity


[Figure: two panels of states 1–6 with one-step transition probabilities p3j (n) and p3j (n + 1) drawn at times n and n + 1]

Figure 4: Probability to go in one step at time instants n and n + 1.

A Markov process at time n is fully defined by:


pij (n) = P r{S(n + 1) = j|S(n) = i}, i, j ∈ E. (19)

A Markov process at time (n + 1) is fully defined by:


pij (n + 1) = P r{S(n + 2) = j|S(n + 1) = i}, i, j ∈ E. (20)

Similarly, a Markov process at time (n + m) is fully defined by:

pij (n + m) = P r{S(n + m + 1) = j|S(n + m) = i}, i, j ∈ E. (21)

We may use a matrix for one-step transitions from i to j between times n and (n + 1):

P (n) =
⎡ p11 (n)  p12 (n)  p13 (n)  · · ·  p1M (n) ⎤
⎢ p21 (n)  p22 (n)  p23 (n)  · · ·  p2M (n) ⎥
⎢ p31 (n)  p32 (n)  p33 (n)  · · ·  p3M (n) ⎥
⎢   ...      ...      ...     ...     ...   ⎥
⎣ pM1 (n)  pM2 (n)  pM3 (n)  · · ·  pMM (n) ⎦

with Σ_{i=1}^{M} pji (n) = 1, ∀j, ∀n.    (22)

Definitions:
• a Markov chain whose transition probabilities depend on time is non-homogeneous;
• a Markov chain whose transition probabilities do not depend on time is homogeneous.

We can drop the dependence on time for these chains and write:

pij = P r{S(n + 1) = j|S(n) = i},    (23)

• as the transition probability from state i to state j;

• the indices n and (n + 1) merely indicate that these are one-step transition probabilities.

[Figure: two panels of states 1–6 with transition probabilities p2j and p3j identical at times n, n + 1, n + 2, . . .]

Figure 5: Transition probabilities of a homogeneous Markov chain do not depend on time.
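
As an illustration, a minimal simulation sketch (Python/NumPy, names hypothetical) of a homogeneous discrete-time Markov chain; because the chain is homogeneous, the same matrix P is applied at every step:

```python
import numpy as np

def simulate_dtmc(P, s0, n_steps, rng):
    """Simulate a homogeneous discrete-time Markov chain.

    P  : (M, M) one-step transition matrix, rows sum to 1
    s0 : initial state index
    """
    states = np.empty(n_steps + 1, dtype=int)
    states[0] = s0
    for n in range(n_steps):
        # next state drawn from row P[current state]; same P for every n
        states[n + 1] = rng.choice(len(P), p=P[states[n]])
    return states

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
rng = np.random.default_rng(3)
print(simulate_dtmc(P, s0=0, n_steps=20, rng=rng))
```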


6. Discrete-time homogeneous Markov chains


Recall, a sequence of RVs forms a discrete-time Markov chain if:

P r{S(n + 1) = j|S(n) = i, . . . , S(0) = m} = P r{S(n + 1) = j|S(n) = i},    (24)

• the state space is E = {0, 1, . . . , M };

• the index set is time, T = {0, 1, . . . }.

We consider only homogeneous discrete-time Markov chains here, i.e.

P r{S(n + 1) = j|S(n) = i} = pij .    (25)

The question we are interested in:

• what is the holding (sojourn) time in a state?


[Figure: timeline of states 0, 1, 2, 3, . . . with transitions 0 → 1, 0 → 2, 1 → 0; an arbitrary state F with a self-loop]

Figure 6: An arbitrary state F of a discrete-time Markov chain.

• (1 − p) – probability to stay in state F in the next slot;

• p – probability to jump from state F to any other state.

Analysis:
• assume that the Markov chain is in state F at a certain time;
• the Markov chain stays in state F in the next slot given that it is currently in state F :

P r{S(n + 1) = F |S(n) = F } = 1 − p.    (26)

• the Markov chain jumps to some other state in the next slot given that it is currently in state F :

P r{S(n + 1) ≠ F |S(n) = F } = p.    (27)

• the Markov chain stays in state F for m time units given that it is currently in state F :

P r{S(n + m) = F, . . . , S(n + 1) = F |S(n) = F } = (1 − p) × (1 − p) × · · · × (1 − p) = (1 − p)^m .    (28)

• the Markov chain stays in state F for m time units and then exits from F :

P r{S(n + m + 1) ≠ F, S(n + m) = F, . . . , S(n + 1) = F |S(n) = F } = (1 − p)^m p.    (29)

Important note:
• the latter is the geometric distribution, which is memoryless in nature.
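
A minimal simulation sketch (Python/NumPy, names hypothetical) confirming the geometric sojourn time: with exit probability p per slot, the mean number of extra slots spent in F should approach (1 − p)/p and P r{m ≥ 5} should approach (1 − p)⁵:

```python
import numpy as np

rng = np.random.default_rng(4)
p = 0.3                       # probability to leave state F in each slot

def sojourn_in_F():
    """Count how many extra slots the chain stays in F before leaving."""
    m = 0
    while rng.random() >= p:  # with probability 1 - p, stay for another slot
        m += 1
    return m

samples = np.array([sojourn_in_F() for _ in range(100_000)])
print(samples.mean())         # approx. (1 - p) / p  = 2.333...
print(np.mean(samples >= 5))  # approx. (1 - p)**5  = 0.168
```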


7. Continuous-time homogeneous Markov chains


We consider:
• a homogeneous continuous-time Markov chain;
• note: the decision whether to jump to another state can be taken at any time instant!

[Figure: transitions 0 → 1, 0 → 2, 1 → 0 occurring at arbitrary points on the continuous time axis]

Figure 7: Transitions in a continuous-time Markov chain.

We are looking for the sojourn time in a state.

Let us now:
• tag a time instant t0 ;
• assume the system at time t0 is in state i;
• let τ be the RV describing the sojourn time in state i.

[Figure: time axis with the tagged instant t0 and an elapsed time s]

Figure 8: Time the process stays in state i.

Consider the following:

• the Markov chain is still in state i after some time t = t0 + s;
• the distribution of the time it stays in state i after time t = t0 + s is the same as after time t = t0 :

P r{τ > s + t|τ > s} = P r{τ > t},    (30)

– the future evolution must not depend on the past (Markov property)!

What we have right now:
• the distribution of the sojourn time must be memoryless and continuous;
• the only continuous memoryless distribution is the exponential one:

P r{τ ≤ t} = 1 − e^{−λi t} ,    ∀i, t > 0.    (31)

Parameter λi :
• λi is called the transition rate out of state i;
• λi is non-negative;
• if λi = 0: the process always stays in state i;
• if 0 < λi < ∞: the probability that the process changes its state in a small interval ∆t is approximately:

λi ∆t.    (32)
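
A small sketch (Python/NumPy, names hypothetical) illustrating the memoryless property (30)–(31): for exponential sojourn times, the conditional probability P r{τ > s + t|τ > s} matches P r{τ > t} = e^{−λi t}:

```python
import numpy as np

rng = np.random.default_rng(5)
lam = 2.0                              # transition rate out of state i
tau = rng.exponential(scale=1.0 / lam, size=1_000_000)

s, t = 0.4, 0.7
survived_s = tau[tau > s]              # sample paths still in state i at time s
cond = np.mean(survived_s > s + t)     # Pr{tau > s + t | tau > s}
uncond = np.mean(tau > t)              # Pr{tau > t}
print(cond, uncond, np.exp(-lam * t))  # all three approximately equal
```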

Consider an M -state continuous-time Markov chain.

[Figure: states 1, 2, 3, . . . , M connected by transition rates λ12 , λ21 , λ13 , λ31 , λ23 , λ32 , λ3i , λi3 , λiM , λM i ]

Figure 9: M -state continuous-time Markov chain.

The transition rate from state i to any other state j is given by:

λij = lim_{∆t→0} pij (∆t) / ∆t.    (33)

Let now i = j:

λii = lim_{∆t→0} (pii (∆t) − 1) / ∆t.    (34)

To balance it out we should have:

λii = −Σ_{j, j≠i} λij ,    ∀i.    (35)

• often −λii is denoted simply by λi ;

• note that 1/Σ_{j, j≠i} λij is the mean sojourn time (exponential holding time) in the state.
Transition rates can be structured into the transition rate matrix:

Λ =
⎡ −Σ_{j≠1} λ1j       λ12            λ13     · · ·      λ1M       ⎤
⎢      λ21      −Σ_{j≠2} λ2j       λ23     · · ·      λ2M       ⎥    (36)
⎢      ...           ...            ...     ...        ...       ⎥
⎣      λM1           λM2            λM3     · · ·  −Σ_{j≠M} λMj  ⎦

• this matrix is also called the infinitesimal generator of the continuous-time Markov chain.
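
A minimal sketch (Python/NumPy, names hypothetical) of assembling Λ from off-diagonal rates and checking its defining properties: each row sums to zero, and 1/λi gives the mean sojourn time:

```python
import numpy as np

def make_generator(rates):
    """Build the infinitesimal generator from off-diagonal rates.

    rates : (M, M) array of lambda_ij for i != j (diagonal is ignored)
    """
    Q = np.array(rates, dtype=float)
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))  # lambda_ii = -sum_{j != i} lambda_ij
    return Q

rates = [[0.0, 1.0, 2.0],
         [0.5, 0.0, 1.5],
         [3.0, 1.0, 0.0]]
Q = make_generator(rates)
print(Q.sum(axis=1))       # each row sums to 0
print(1.0 / -np.diag(Q))   # mean sojourn times 1 / lambda_i
```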


8. Classification of states
States of a Markov chain are classified as:
• recurrent:
– assume that the process leaves a certain state;
– the state is recurrent if the process returns to it after some time with probability 1.
• transient:
– assume that the process leaves a certain state;
– the state is transient if the process may never return to it (returns with probability less than 1).
• absorbing:
– assume that the process enters a certain state;
– the state is absorbing if the process cannot visit any other state afterwards.

Note: any state is one of the above.


[Figure: chain 1 ↔ 2 ↔ 3 ↔ 4 with p12 , p21 , p23 , p32 , p34 , p43 > 0]

Figure 10: All states are recurrent.

[Figure: the same chain with p12 , p21 , p23 , p32 , p34 > 0 but p43 = 0]

Figure 11: States 1, 2, 3 are transient, while state 4 is absorbing.
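
For chains like those in Figures 10 and 11, absorbing states are easy to spot programmatically: a state i is absorbing exactly when pii = 1. A minimal sketch (Python/NumPy, names hypothetical, transition probabilities chosen for illustration):

```python
import numpy as np

def absorbing_states(P):
    """Return indices i with p_ii = 1 (once entered, never left)."""
    return [i for i in range(len(P)) if np.isclose(P[i, i], 1.0)]

# Chain of Figure 11: p43 = 0, so state 4 (index 3) is absorbing
P = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 0.0, 1.0]])
print(absorbing_states(P))   # [3]
```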

Do the following:
• let fj (n) be the probability that the process first returns to state j exactly n steps after leaving state j;
• define the following quantities for fj (n):

fj = Σ_{n=1}^{∞} fj (n),    E[fj ] = Σ_{n=1}^{∞} n fj (n),    (37)

• fj : probability that the process returns to state j some time after leaving it;
• E[fj ]: mean number of steps needed to return to state j after leaving it:
– it is also called the mean recurrence time for state j.

Therefore, the state is:


• recurrent if fj = 1;
• transient if fj < 1.

Based on the mean recurrence time, the recurrent states are:
• positive recurrent:
– if the mean recurrence time is finite: E[fj ] < ∞.
• null recurrent:
– if the mean recurrence time is infinite: E[fj ] = ∞.

Based on the structure of the recurrence time we distinguish between:

• periodic states:
– the recurrence time has a period α, α > 1;
– it means that returns to the state are possible only at steps α, 2α, 3α, . . .
• aperiodic ones:
– these states do not have a period.

Definition: a recurrent state is ergodic if it is positive recurrent and aperiodic:

E[fj ] < ∞.    (38)

Definition: a Markov chain is called ergodic if all its states are ergodic.
Definition: a Markov chain is called irreducible if every state can be reached from every other state. In an irreducible chain all states are of the same type, i.e., either:
• all states are transient;
• all states are null recurrent;
• all states are ergodic (positive recurrent and aperiodic);
• or, if periodic, all states have the same period.
Fact: an aperiodic irreducible Markov chain with a finite number of states is always ergodic.
Important notes:
• a Markov chain is simply a class of stochastic processes;
• for a Markov chain, ergodicity means that stationary state probabilities exist!


9. Ergodic Markov chains


Assume the following:
• an ergodic homogeneous Markov chain;
• for such a chain the steady-state probabilities exist and are given by:

pj = lim_{n→∞} P r{S(n) = j}.    (39)

Important notes:
• the steady-state probabilities are independent of the initial state probabilities;
• one can find them using the mean recurrence time as follows:

pj = 1/E[fj ],    ∀j.    (40)

The steady-state distribution pi , ∀i, is also called:

• equilibrium state distribution, stationary distribution, etc.

Note the following:
• it is a really complicated task to determine E[fj ], ∀j;
• question: is there any other way to determine pj = limn→∞ P r{S(n) = j}?
The stationary distribution of a discrete-time Markov chain is the solution of:

Σ_{∀i} pi pij = pj , ∀j,    Σ_{∀i} pi = 1.    (41)

In matrix form:

p P = p,    p e = 1,    (42)

• p is the row vector of stationary probabilities;
• e is the unit column vector with all components equal to 1.

Note the following: if the system has a finite number of states, N , we solve using:
• N − 1 equations from the available N balance equations;
• the normalization condition.
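
A minimal sketch of this recipe (Python/NumPy, names hypothetical) for a finite ergodic chain: replace one balance equation of p P = p by the normalization condition and solve the resulting linear system:

```python
import numpy as np

def dtmc_stationary(P):
    """Solve p P = p, sum(p) = 1 for a finite ergodic chain."""
    M = len(P)
    A = P.T - np.eye(M)   # rows of A are the balance equations
    A[-1, :] = 1.0        # replace one equation by sum_i p_i = 1
    b = np.zeros(M)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
p = dtmc_stationary(P)
print(p, p @ P)           # p and p P coincide
```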

The stationary distribution of a continuous-time Markov chain is the solution of:

Σ_{∀i} pi λij = 0, ∀j,    Σ_{∀i} pi = 1,    (43)

• the first equations are the balance equations;

• the second equation is the normalizing condition.
In matrix form:

p Λ = 0,    p e = 1,    (44)

• e is the unit column vector;

• Λ is the infinitesimal generator (transition rate matrix).
Note the following: if the system has a finite number of states, N , we solve using:
• N − 1 equations from the available N balance equations;
• the normalization condition.
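
The continuous-time case follows the same pattern; a minimal sketch (Python/NumPy, names hypothetical, rates chosen for illustration): solve p Λ = 0 with one balance equation replaced by the normalization condition:

```python
import numpy as np

def ctmc_stationary(Q):
    """Solve p Q = 0, sum(p) = 1, where Q is the infinitesimal
    generator of a finite ergodic chain (rows sum to zero)."""
    M = len(Q)
    A = Q.T.copy()   # rows of A are the balance equations
    A[-1, :] = 1.0   # replace one equation by sum_i p_i = 1
    b = np.zeros(M)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

Q = np.array([[-3.0, 1.0, 2.0],
              [0.5, -2.0, 1.5],
              [3.0, 1.0, -4.0]])
p = ctmc_stationary(Q)
print(p, p @ Q)      # p Q is (numerically) the zero vector
```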
