
Reminder of stochastic processes

Lecturer: Dmitri A. Moltchanov


E-mail: moltchan@cs.tut.fi

http://www.cs.tut.fi/kurssit/ELT-53606/
Network analysis and dimensioning I D.Moltchanov, TUT, 2013
OUTLINE:
• Definition and basic notions;
• Characterization of stochastic processes;
• Classifications of stochastic processes;
• Ergodic processes;
• Markov processes;
• Discrete-time homogeneous Markov chains;
• Continuous-time homogeneous Markov chains;
• Classification of states;
• Ergodic chains and stationary distribution.


1. Definitions and notions


Assume the following:
• (Ω, F, P ) is a probability space;
• there are random variables defined on this space.

A set of random variables indexed by some parameter t, S(t) ∈ E:

S = {S(t), t ∈ T } (1)

• is called a stochastic process, where E is the state space of the process;


• in other words, for each t ∈ T , S(t) is a mapping from Ω to the set E.

Examples of processes:
• packet arrival process to an IP router;
• call arrival process to a telephone exchange, etc.

DEFINING SET T :
• set T is the index set of the stochastic process;
• set T is often (not always!) referred to as time.

If T is countable:

T = ℕ = {0, 1, . . . },
T = ℤ = {. . . , −2, −1, 0, 1, 2, . . . },    (2)

• we are given a discrete-time stochastic process.

If T is not countable:

T = ℝ = (−∞, ∞),
T = (0, ∞),    (3)

• we are given a continuous-time stochastic process.

DEFINING SET E:
• set E is called the state space of the stochastic process {S(t), t ∈ T }.

If E is countable:

E = ℕ = {0, 1, . . . },
E = ℤ = {. . . , −2, −1, 0, 1, 2, . . . },    (4)

• we are given a discrete-space stochastic process.

If E is not countable:

E = (0, ∞),
E = ℝ = (−∞, ∞),    (5)

• we are given a continuous-space stochastic process.


[Figure: observed points, vertical axis "Space E" (values around 2.2·10⁴ to 3·10⁴), horizontal axis "Space T " (t = 10, . . . , 50)]

Figure 1: Observations of a discrete-time, discrete-space stochastic process.


1.1. Sections and trajectories


Do the following:
• observe a certain stochastic process a number of times.
[Figure: several realizations of S(t) plotted against t, with vertical sections marked]

Figure 2: Realizations and sections of a stochastic process.

• fix a certain curve: we get a trajectory (realization) of the stochastic process;
• fix a certain t ∈ T : we get a section of the stochastic process.


2. Characteristics of stochastic processes


Observe the following:
• it is still unclear how to formally characterize a stochastic process;
• recall: any RV can be fully characterized by its CDF (pdf or pmf).

Similarly, we can characterize an arbitrary section:

F (t, s) = P r{S(t) ≤ s}. (6)

• depends on both t and s;


• is called the one-dimensional distribution function of {S(t), t ∈ T }.

F (t, s) does not completely characterize the stochastic process:


• example: what if there is dependence between subsequent sections?

More information: the two-dimensional distribution function:

F (t1 , s1 , t2 , s2 ) = P r{S(t1 ) ≤ s1 , S(t2 ) ≤ s2 }.    (7)

By induction:

F (t1 , s1 , t2 , s2 , . . . ) = P r{S(t1 ) ≤ s1 , S(t2 ) ≤ s2 , . . . }.    (8)

We observe the following:

• a full description is given by the joint distribution of all its sections;
• it is not easy to deal with such a description.
What we usually do in practice:
• we usually do not use general processes:
– example: Markov processes: the two-dimensional distribution is sufficient.
• we may also limit the description of processes to moments:
– example: correlation theory: mean and autocorrelation function.


2.1. Mean
Mean of the stochastic process {S(t), t ∈ T }:
• a non-probabilistic function ms (t);
• for all t ∈ T it equals the mean of the corresponding section:

ms (t) = ∫_{−∞}^{∞} x s(t, x) dx,    (9)

where s(t, x) is the pdf of the section S(t).

[Figure: realizations of S(t) with the mean curve drawn through them]

Figure 3: Mean of the stochastic process {S(t), t ∈ T } (denoted by the black line).


2.2. Variance
Variance of the stochastic process {S(t), t ∈ T }:
• a non-probabilistic function Ds (t);
• for all t ∈ T it equals the variance of the corresponding section:

Ds (t) = ∫_{−∞}^{∞} (x − ms (t))² s(t, x) dx.    (10)

One can similarly define:

• skewness;
• kurtosis;
• higher moments and central moments.
Important:
• these moments characterize sections in isolation!
• we still need a descriptor of dependence.


2.3. Autocorrelation function


Autocorrelation function (ACF):
• a non-probabilistic function Ks (t1 , t2 );
• for all pairs t1 , t2 ∈ T it is just the covariance of the corresponding sections:

Ks (t1 , t2 ) = E[S(t1 )S(t2 )] − ms (t1 )ms (t2 ).    (11)

– recall: this is a measure of linear dependence between sections.
Normalized autocorrelation function (NACF):
• a non-probabilistic function Rs (t1 , t2 );
• for all pairs t1 , t2 ∈ T it is just the correlation coefficient of the corresponding sections:

Rs (t1 , t2 ) = (E[S(t1 )S(t2 )] − ms (t1 )ms (t2 )) / √(Ds (t1 )Ds (t2 )).    (12)

Important notes: NACF and ACF can be used interchangeably:
• since −1 ≤ Rs (t1 , t2 ) ≤ 1, the NACF is often preferable;
• the ACF includes the variance, since Ks (t1 , t1 ) = Ds (t1 )!
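
To make these estimators concrete, here is a minimal sketch in Python with NumPy (function and variable names are ours, not from the lecture) that estimates the section means, variances, and the NACF from an ensemble of observed realizations, replacing expectations by sample averages over realizations:

```python
import numpy as np

def section_statistics(X):
    """Estimate m_s(t), D_s(t) and R_s(t1, t2) from an ensemble X.

    X is an (n_realizations, n_time_points) array: each row is one
    trajectory, each column is one section of the process.
    """
    m = X.mean(axis=0)                 # section means m_s(t)
    D = X.var(axis=0, ddof=1)          # section variances D_s(t)
    Xc = X - m                         # centered sections
    # ACF estimate: K_s(t1, t2) = E[S(t1)S(t2)] - m_s(t1) m_s(t2)
    K = (Xc.T @ Xc) / (X.shape[0] - 1)
    # NACF estimate: R_s(t1, t2) = K_s(t1, t2) / sqrt(D_s(t1) D_s(t2))
    R = K / np.sqrt(np.outer(D, D))
    return m, D, R

# Example: 1000 realizations of a simple dependent process (a random walk)
rng = np.random.default_rng(1)
X = np.cumsum(rng.normal(size=(1000, 50)), axis=1)
m, D, R = section_statistics(X)
print(D[:5])       # variance grows with t: the process is non-stationary
print(R[10, 20])   # strong positive correlation between sections
```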


3. Classifications
3.1. Based on stationarity
General classification:
• stationary processes:
– at least the mean and ACF do not depend on time.
• non-stationary processes:
– the mean or ACF or both depend on time;
– practically, their characteristics may change in time.
Note: stationarity is an advantageous property:
• correlation theory is quite well developed;
• but we are limited to the first two moments (Ks (ti , ti ) = Ds (ti )) and linear dependence!
Notes on non-stationary processes:
• very hard to deal with;
• it seems that most processes observed in networks are somewhat non-stationary!


3.2. Based on the type of stationarity


A stochastic process can be:
• strictly stationary:
– the n-dimensional distribution function does not change with the time shift τ for all n;
– for (t1 , t2 , . . . , tn ) and (t1 + τ, t2 + τ, . . . , tn + τ ) the n-dimensional distribution is the same.
• covariance stationary:
– the mean is constant in time;
– the ACF depends on the time shift only:

Ks (t1 , t2 ) = Ks (τ ),    τ = |t2 − t1 |.    (13)

Note that strictly stationary processes are also called:

• strong-sense stationary processes.
Note that covariance stationary processes are also called:
• weakly stationary processes;
• second-order stationary processes.


3.3. Based on memory


A stochastic process can be:
• with one-step-ahead dependence:
– called a Markov process;
– named after A.A. Markov;
– can be completely characterized by the set of two-dimensional distributions;
– the most important class of processes for many applied fields due to analytical tractability.
• other processes.

3.4. Based on ergodic property


A stochastic process can be:
• ergodic;
• non-ergodic.


4. Ergodic stationary processes


The main property:
• a single trajectory gives all information regarding the characteristics of the process;
• for example: mean, variance, ACF, etc.
Notes on the ergodic property:
• a trajectory must be observed for long enough...
• how long is long enough?...
Mean of a stationary ergodic process is given by:

ms = E[S(t)] = lim_{T→∞} (1/2T) ∫_{−T}^{T} S(t) dt.    (14)

Variance of a stationary ergodic process is given by:

Ds = D[S(t)] = lim_{T→∞} (1/2T) ∫_{−T}^{T} (S(t) − ms )² dt.    (15)

ACF of a stationary ergodic process is given by:

Ks (τ ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} (S(t) − ms )(S(t − τ ) − ms ) dt.    (16)

Sufficient condition of ergodicity is given by:

lim_{τ→∞} Ks (τ ) = 0.    (17)
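
For a sampled trajectory the limits above turn into long-run sample averages. A minimal sketch (Python/NumPy, names hypothetical), assuming the process observed is indeed stationary ergodic:

```python
import numpy as np

def time_averages(x, max_lag=50):
    """Estimate mean, variance and ACF K_s(tau) from one long trajectory x,
    replacing the integrals (14)-(16) by sums over a finite window."""
    m = x.mean()
    d = x.var()
    xc = x - m
    acf = np.array([np.mean(xc[lag:] * xc[:len(x) - lag])
                    for lag in range(max_lag + 1)])
    return m, d, acf

# Ergodic example: AR(1) process x[n] = a*x[n-1] + noise, |a| < 1
rng = np.random.default_rng(2)
a, n = 0.8, 200_000
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = a * x[i - 1] + rng.normal()
m, d, acf = time_averages(x)
print(m)                  # close to the true mean, 0
print(acf[10] / acf[0])   # close to a**10: K_s(tau) -> 0, so condition (17) holds
```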

Summarizing the ergodic property:

• it allows one to easily compute characteristics using a single realization of the process;
• one has to be sure that the process observed is ergodic. Tough to do!

Practically, a process is not stationary ergodic:

• when some type of heterogeneity is found...
• e.g., when there is a deterministic influence at some instants of time, etc.


5. Markov processes
Historical facts:
• A.A. Markov was a Russian mathematician (1856–1922):
– first results in 1906 (discrete-space Markov processes);
– extension of the law of large numbers to dependent events;
– contributors: Kolmogorov, Khinchine, Fokker, Planck, Uhlenbeck, Wiener, Feller, . . .
• note: classic queuing theory is mostly based on Markovian processes!
Assume the following:
• we are given a discrete-time, discrete-space process {S(t), t ∈ T };
• the stochastic process {S(t), t ∈ T } is called a Markov process if it satisfies:

P r{S(tn+1 ) = sn+1 |S(tn ) = sn , S(tn−1 ) = sn−1 , . . . } = P r{S(tn+1 ) = sn+1 |S(tn ) = sn }.    (18)

– called the Markov property;
– also called the memoryless property, one-step memory, memoryless process, etc.


5.1. Description and time-homogeneity


[Figure: two panels of states 1–6 with one-step transition probabilities p3j (n) and p3j (n + 1) drawn at times n and n + 1]

Figure 4: Probability to go in one step at time instants n and n + 1.

A Markov process at time n is fully defined by:


pij (n) = P r{S(n + 1) = j|S(n) = i}, i, j ∈ E. (19)

A Markov process at time (n + 1) is fully defined by:


pij (n + 1) = P r{S(n + 2) = j|S(n + 1) = i}, i, j ∈ E. (20)

Similarly, a Markov process at time (n + m) is fully defined by:

pij (n + m) = P r{S(n + m + 1) = j|S(n + m) = i}, i, j ∈ E. (21)

We may use a matrix for one-step transitions from i to j between times n and (n + 1):

P (n) =
⎡ p11 (n)  p12 (n)  p13 (n)  · · ·  p1M (n) ⎤
⎢ p21 (n)  p22 (n)  p23 (n)  · · ·  p2M (n) ⎥
⎢ p31 (n)  p32 (n)  p33 (n)  · · ·  p3M (n) ⎥
⎢   ...      ...      ...     ...     ...   ⎥
⎣ pM1 (n)  pM2 (n)  pM3 (n)  · · ·  pMM (n) ⎦

with Σ_{i=1}^{M} pji (n) = 1, ∀j, ∀n.    (22)

Definitions:
• a Markov chain whose transition probabilities depend on time is non-homogeneous;
• a Markov chain whose transition probabilities do not depend on time is homogeneous.

We can drop the dependence on time for these chains and write:

pij = P r{S(n + 1) = j|S(n) = i},    (23)

• as the transition probability from state i to state j;

• the indices n and (n + 1) merely indicate that these are one-step transition probabilities.

[Figure: two panels of states 1–6 with transition probabilities p2j and p3j identical at times n, n + 1, n + 2, . . .]

Figure 5: Transition probabilities of a homogeneous Markov chain do not depend on time.
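
As an illustration, a minimal simulation sketch (Python/NumPy, names hypothetical) of a homogeneous discrete-time Markov chain; because the chain is homogeneous, the same matrix P is applied at every step:

```python
import numpy as np

def simulate_dtmc(P, s0, n_steps, rng):
    """Simulate a homogeneous discrete-time Markov chain.

    P  : (M, M) one-step transition matrix, rows sum to 1
    s0 : initial state index
    """
    states = np.empty(n_steps + 1, dtype=int)
    states[0] = s0
    for n in range(n_steps):
        # next state drawn from row P[current state]; same P for every n
        states[n + 1] = rng.choice(len(P), p=P[states[n]])
    return states

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
rng = np.random.default_rng(3)
print(simulate_dtmc(P, s0=0, n_steps=20, rng=rng))
```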


6. Discrete-time homogeneous Markov chains


Recall, a sequence of RVs forms a discrete-time Markov chain if:

P r{S(n + 1) = j|S(n) = i, . . . , S(0) = m} = P r{S(n + 1) = j|S(n) = i},    (24)

• the state space is E = {0, 1, . . . , M };

• the index set is time, T = {0, 1, . . . }.

We consider only homogeneous discrete-time Markov chains here, i.e.

P r{S(n + 1) = j|S(n) = i} = pij .    (25)

The question we are interested in:

• what is the holding (sojourn) time in a state?


[Figure: timeline of states 0, 1, 2, 3, . . . with transitions 0 → 1, 0 → 2, 1 → 0; an arbitrary state F with a self-loop]

Figure 6: An arbitrary state F of a discrete-time Markov chain.

• (1 − p) – probability to stay in state F in the next slot;

• p – probability to jump from state F to any other state.

Analysis:
• assume that the Markov chain is in state F at a certain time;
• the Markov chain stays in state F in the next slot given that it is currently in state F :

P r{S(n + 1) = F |S(n) = F } = 1 − p.    (26)

• the Markov chain jumps to some other state in the next slot given that it is currently in state F :

P r{S(n + 1) ≠ F |S(n) = F } = p.    (27)

• the Markov chain stays in state F for m time units given that it is currently in state F :

P r{S(n + m) = F, . . . , S(n + 1) = F |S(n) = F } = (1 − p) × (1 − p) × · · · × (1 − p) = (1 − p)^m .    (28)

• the Markov chain stays in state F for m time units and then exits from F :

P r{S(n + m + 1) ≠ F, S(n + m) = F, . . . , S(n + 1) = F |S(n) = F } = (1 − p)^m p.    (29)

Important note:
• the latter is the geometric distribution, which is memoryless in nature.
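
A minimal simulation sketch (Python/NumPy, names hypothetical) confirming the geometric sojourn time: with exit probability p per slot, the mean number of extra slots spent in F should approach (1 − p)/p and P r{m ≥ 5} should approach (1 − p)⁵:

```python
import numpy as np

rng = np.random.default_rng(4)
p = 0.3                       # probability to leave state F in each slot

def sojourn_in_F():
    """Count how many extra slots the chain stays in F before leaving."""
    m = 0
    while rng.random() >= p:  # with probability 1 - p, stay for another slot
        m += 1
    return m

samples = np.array([sojourn_in_F() for _ in range(100_000)])
print(samples.mean())         # approx. (1 - p) / p  = 2.333...
print(np.mean(samples >= 5))  # approx. (1 - p)**5  = 0.168
```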


7. Continuous-time homogeneous Markov chains


We consider:
• a homogeneous continuous-time Markov chain;
• note: the decision whether to jump to another state can be taken at any time instant!

[Figure: transitions 0 → 1, 0 → 2, 1 → 0 occurring at arbitrary points on the continuous time axis]

Figure 7: Transitions in a continuous-time Markov chain.

We are looking for the sojourn time in a state.

Let us now:
• tag a time instant t0 ;
• assume the system at time t0 is in state i;
• let τ be the RV describing the sojourn time in state i.

[Figure: time axis with the tagged instant t0 and an elapsed time s]

Figure 8: Time the process stays in state i.

Consider the following:

• the Markov chain is still in state i after some time t = t0 + s;
• the distribution of the time it stays in state i after time t = t0 + s is the same as after time t = t0 :

P r{τ > s + t|τ > s} = P r{τ > t},    (30)

– the future evolution must not depend on the past (Markov property)!

What we have right now:
• the distribution of the sojourn time must be memoryless and continuous;
• the only continuous memoryless distribution is the exponential one:

P r{τ ≤ t} = 1 − e^{−λi t} ,    ∀i, t > 0.    (31)

Parameter λi :
• λi is called the transition rate out of state i;
• λi is non-negative;
• if λi = 0: the process always stays in state i;
• if 0 < λi < ∞: the probability that the process changes its state in a small interval ∆t is approximately:

λi ∆t.    (32)
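
A small sketch (Python/NumPy, names hypothetical) illustrating the memoryless property (30)–(31): for exponential sojourn times, the conditional probability P r{τ > s + t|τ > s} matches P r{τ > t} = e^{−λi t}:

```python
import numpy as np

rng = np.random.default_rng(5)
lam = 2.0                              # transition rate out of state i
tau = rng.exponential(scale=1.0 / lam, size=1_000_000)

s, t = 0.4, 0.7
survived_s = tau[tau > s]              # sample paths still in state i at time s
cond = np.mean(survived_s > s + t)     # Pr{tau > s + t | tau > s}
uncond = np.mean(tau > t)              # Pr{tau > t}
print(cond, uncond, np.exp(-lam * t))  # all three approximately equal
```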

Consider an M -state continuous-time Markov chain.

[Figure: states 1, 2, 3, . . . , M connected by transition rates λ12 , λ21 , λ13 , λ31 , λ23 , λ32 , λ3i , λi3 , λiM , λM i ]

Figure 9: M -state continuous-time Markov chain.

The transition rate from state i to any other state j is given by:

λij = lim_{∆t→0} pij (∆t) / ∆t.    (33)

Let now i = j:

λii = lim_{∆t→0} (pii (∆t) − 1) / ∆t.    (34)

To balance it out we should have:

λii = −Σ_{j, j≠i} λij ,    ∀i.    (35)

• often −λii is denoted simply by λi ;

• note that 1/Σ_{j, j≠i} λij is the mean sojourn time (exponential holding time) in the state.
Transition rates can be structured into the transition rate matrix:

Λ =
⎡ −Σ_{j≠1} λ1j       λ12            λ13     · · ·      λ1M       ⎤
⎢      λ21      −Σ_{j≠2} λ2j       λ23     · · ·      λ2M       ⎥    (36)
⎢      ...           ...            ...     ...        ...       ⎥
⎣      λM1           λM2            λM3     · · ·  −Σ_{j≠M} λMj  ⎦

• this matrix is also called the infinitesimal generator of the continuous-time Markov chain.
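
A minimal sketch (Python/NumPy, names hypothetical) of assembling Λ from off-diagonal rates and checking its defining properties: each row sums to zero, and 1/λi gives the mean sojourn time:

```python
import numpy as np

def make_generator(rates):
    """Build the infinitesimal generator from off-diagonal rates.

    rates : (M, M) array of lambda_ij for i != j (diagonal is ignored)
    """
    Q = np.array(rates, dtype=float)
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))  # lambda_ii = -sum_{j != i} lambda_ij
    return Q

rates = [[0.0, 1.0, 2.0],
         [0.5, 0.0, 1.5],
         [3.0, 1.0, 0.0]]
Q = make_generator(rates)
print(Q.sum(axis=1))       # each row sums to 0
print(1.0 / -np.diag(Q))   # mean sojourn times 1 / lambda_i
```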


8. Classification of states
States of a Markov chain are classified as:
• recurrent:
– assume that the process leaves a certain state;
– the state is recurrent if the process returns to it after some time with probability 1.
• transient:
– assume that the process leaves a certain state;
– the state is transient if the process may never return to it (returns with probability less than 1).
• absorbing:
– assume that the process enters a certain state;
– the state is absorbing if the process cannot visit any other state afterwards.

Note: any state is one of the above.


[Figure: chain 1 ↔ 2 ↔ 3 ↔ 4 with p12 , p21 , p23 , p32 , p34 , p43 > 0]

Figure 10: All states are recurrent.

[Figure: the same chain with p12 , p21 , p23 , p32 , p34 > 0 but p43 = 0]

Figure 11: States 1, 2, 3 are transient, while state 4 is absorbing.
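
For chains like those in Figures 10 and 11, absorbing states are easy to spot programmatically: a state i is absorbing exactly when pii = 1. A minimal sketch (Python/NumPy, names hypothetical, transition probabilities chosen for illustration):

```python
import numpy as np

def absorbing_states(P):
    """Return indices i with p_ii = 1 (once entered, never left)."""
    return [i for i in range(len(P)) if np.isclose(P[i, i], 1.0)]

# Chain of Figure 11: p43 = 0, so state 4 (index 3) is absorbing
P = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 0.0, 1.0]])
print(absorbing_states(P))   # [3]
```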

Do the following:
• let fj (n) be the probability that the process first returns to state j exactly n steps after leaving state j;
• define the following quantities for fj (n):

fj = Σ_{n=1}^{∞} fj (n),    E[fj ] = Σ_{n=1}^{∞} n fj (n),    (37)

• fj : probability that the process returns to state j some time after leaving it;
• E[fj ]: mean number of steps needed to return to state j after leaving it:
– it is also called the mean recurrence time for state j.

Therefore, the state is:


• recurrent if fj = 1;
• transient if fj < 1.

Based on the mean recurrence time, the recurrent states are:
• positive recurrent:
– if the mean recurrence time is finite: E[fj ] < ∞.
• null recurrent:
– if the mean recurrence time is infinite: E[fj ] = ∞.

Based on the structure of the recurrence time we distinguish between:

• periodic states:
– the recurrence time has a period α, α > 1;
– it means that returns to the state are possible only at steps α, 2α, 3α, . . .
• aperiodic ones:
– these states do not have a period.

Definition: a recurrent state is ergodic if it is positive recurrent and aperiodic:

E[fj ] < ∞.    (38)

Definition: a Markov chain is called ergodic if all its states are ergodic.
Definition: a Markov chain is called irreducible if every state can be reached from every other state. In an irreducible chain all states are of the same type, i.e., either:
• all states are transient;
• all states are null recurrent;
• all states are ergodic (positive recurrent and aperiodic);
• or, if periodic, all states have the same period.
Fact: an aperiodic irreducible Markov chain with a finite number of states is always ergodic.
Important notes:
• a Markov chain is simply a class of stochastic processes;
• for a Markov chain, ergodicity means that stationary state probabilities exist!


9. Ergodic Markov chains


Assume the following:
• an ergodic homogeneous Markov chain;
• for such a chain the steady-state probabilities exist and are given by:

pj = lim_{n→∞} P r{S(n) = j}.    (39)

Important notes:
• the steady-state probabilities are independent of the initial state probabilities;
• one can find them using the mean recurrence time as follows:

pj = 1/E[fj ],    ∀j.    (40)

The steady-state distribution pi , ∀i, is also called:

• equilibrium state distribution, stationary distribution, etc.

Note the following:
• it is a really complicated task to determine E[fj ], ∀j;
• question: is there any other way to determine pj = limn→∞ P r{S(n) = j}?
The stationary distribution of a discrete-time Markov chain is the solution of:

Σ_{∀i} pi pij = pj , ∀j,    Σ_{∀i} pi = 1.    (41)

In matrix form:

p P = p,    p e = 1,    (42)

• p is the row vector of stationary probabilities;
• e is the unit column vector with all components equal to 1.

Note the following: if the system has a finite number of states, N , we solve using:
• N − 1 equations from the available N balance equations;
• the normalization condition.
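
A minimal sketch of this recipe (Python/NumPy, names hypothetical) for a finite ergodic chain: replace one balance equation of p P = p by the normalization condition and solve the resulting linear system:

```python
import numpy as np

def dtmc_stationary(P):
    """Solve p P = p, sum(p) = 1 for a finite ergodic chain."""
    M = len(P)
    A = P.T - np.eye(M)   # rows of A are the balance equations
    A[-1, :] = 1.0        # replace one equation by sum_i p_i = 1
    b = np.zeros(M)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
p = dtmc_stationary(P)
print(p, p @ P)           # p and p P coincide
```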

The stationary distribution of a continuous-time Markov chain is the solution of:

Σ_{∀i} pi λij = 0, ∀j,    Σ_{∀i} pi = 1,    (43)

• the first equations are the balance equations;

• the second equation is the normalizing condition.
In matrix form:

p Λ = 0,    p e = 1,    (44)

• e is the unit column vector;

• Λ is the infinitesimal generator (transition rate matrix).
Note the following: if the system has a finite number of states, N , we solve using:
• N − 1 equations from the available N balance equations;
• the normalization condition.
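
The continuous-time case follows the same pattern; a minimal sketch (Python/NumPy, names hypothetical, rates chosen for illustration): solve p Λ = 0 with one balance equation replaced by the normalization condition:

```python
import numpy as np

def ctmc_stationary(Q):
    """Solve p Q = 0, sum(p) = 1, where Q is the infinitesimal
    generator of a finite ergodic chain (rows sum to zero)."""
    M = len(Q)
    A = Q.T.copy()   # rows of A are the balance equations
    A[-1, :] = 1.0   # replace one equation by sum_i p_i = 1
    b = np.zeros(M)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

Q = np.array([[-3.0, 1.0, 2.0],
              [0.5, -2.0, 1.5],
              [3.0, 1.0, -4.0]])
p = ctmc_stationary(Q)
print(p, p @ Q)      # p Q is (numerically) the zero vector
```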
