
Chapter II.

Random Signals
1. Definitions
Random variables

I A random variable is a variable that holds a value produced by a
(partially) random phenomenon (experiment)
I Typically denoted as X, Y, etc.
I Examples:
I The value of a die
I The value of the voltage in a circuit
I We get a single value, but it could have been any other
I The opposite = a deterministic (constant) value
Sample space and realizations

I A realization = a single outcome of the random experiment


I Sample space Ω = the set of all values that can be taken by a
random variable X
I i.e. the set of all possible realizations
I Example: rolling a die
I we might get a realization X = 3
I but we could have got any value from the sample space

Ω = {1, 2, 3, 4, 5, 6}

I If Ω is a discrete set –> discrete random variable


I Example: the value of a die
I If Ω is a continuous set –> continuous random variable
I Example: a voltage value
Discrete random variable

I For a discrete random variable, the probability that X takes the value xi is given
by the probability mass function (PMF) wX(xi)

wX (xi ) = P {X = xi }

I Example: what is the PMF of a die?


I For simplicity, we will simply call it the “distribution” of X
I The cumulative distribution function (CDF) gives the probability
that the value of X is smaller than or equal to the argument xi

FX (xi ) = P {X ≤ xi }

I In Romanian: “functie de repartitie”


I Example: draw the CDF of a die (= a staircase function)
Continuous random variable

I The CDF of a continuous r.v. is defined in the same way:

FX (xi ) = P {X ≤ xi }

I The derivative of the CDF is the probability density function


(PDF)
wX(xi) = dFX(xi)/dxi
I The PDF, multiplied by a small interval dx, gives the probability that the value of X
lies in a small vicinity around xi
I Important: the probability that a continuous r.v. X is exactly equal
to a value xi is zero
I because there are infinitely many possible values (a continuous set)
I That's why we can't define a probability mass function as for discrete r.v.
Probability and distribution

I Compute probability from PDF (continuous r.v.):


P{A ≤ X ≤ B} = ∫_A^B wX(x) dx

I Compute probability from PMF (discrete r.v.):


P{A ≤ X ≤ B} = Σ_{x=A}^{B} wX(x)

I The probability that a r.v. X is between A and B is the area below the
PDF (a short code sketch follows below)
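As a quick illustration of the two formulas above, here is a minimal Python sketch (assuming numpy and scipy are installed; the standard Gaussian and the fair die are arbitrary example choices):

    from scipy import integrate, stats

    # Continuous r.v.: standard Gaussian, P{-1 <= X <= 1} = area under the PDF
    pdf = stats.norm(loc=0.0, scale=1.0).pdf
    p_cont, _ = integrate.quad(pdf, -1.0, 1.0)
    print("P{-1 <= X <= 1} for N(0,1):", p_cont)        # ~0.6827

    # Discrete r.v.: fair die, P{2 <= X <= 4} = sum of the PMF over x = 2..4
    pmf = {x: 1 / 6 for x in range(1, 7)}               # w_X(x) = 1/6, x = 1..6
    p_disc = sum(pmf[x] for x in range(2, 5))
    print("P{2 <= X <= 4} for a fair die:", p_disc)     # 0.5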
Properties of PDF/PMF/CDF

I The CDF is monotonically increasing (non-decreasing)


I The PDF/PMF are always ≥ 0
I The CDF starts from 0 and goes up to 1
I Integral/sum over all of the PDF/PMF = 1
I Other properties exist; they will be mentioned when needed
Examples

I Gaussian PDF
I Uniform PDF
I ...
Multiple random variables

I Consider a system with two random variables X and Y


I Joint cumulative distribution function:

FXY(xi, yj) = P {X ≤ xi ∩ Y ≤ yj}

I Joint probability density function:

wXY(xi, yj) = ∂²FXY(xi, yj) / (∂xi ∂yj)
I The joint PDF gives the probability that the values of the two r.v. X
and Y are in a small vicinity of xi and yj simultaneously
I Similar definitions extend to the case of discrete random variables
Random process

I A random process = a sequence of random variables indexed in time


I Discrete-time random process f [n] = a sequence of random variables
at discrete moments of time
I e.g.: a sequence of 50 throws of a die, the daily price on the stock
market
I Continuous-time random process f (t) = a continuous sequence of
random variables at every moment
I e.g.: a noise voltage signal, a speech signal
I Every sample from a random process is a (different) random variable!
I e.g. f (t0 ) = value at time t0 is a r.v.
Realizations of random processes

I A realization of the random process = a particular sequence of values


I e.g. we see a given noise signal on the oscilloscope, but we could have
seen any other realization just as well
I When we consider a random process = we consider the set of all
possible realizations
I Example: draw on the whiteboard (a code sketch also follows below)
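A minimal sketch of the idea (assuming numpy; the die-throwing process is just an illustrative choice): each row below is one particular realization of the same discrete-time random process, and the process itself is the set of all such possible sequences.

    import numpy as np

    rng = np.random.default_rng(0)

    # Discrete-time random process: 50 throws of a fair die.
    # Each row is one realization; we could have observed any other row just as well.
    n_realizations, n_samples = 5, 50
    realizations = rng.integers(1, 7, size=(n_realizations, n_samples))

    for k, r in enumerate(realizations):
        print(f"realization {k}: {r[:10]} ...")          # first 10 samples of each realization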
Distributions of order 1 of random processes

I Every sample f(t1) from a random process is a random variable


I with CDF F1 (xi ; t1 )
I with PDF w1(xi; t1) = dF1(xi; t1)/dxi
I The sample at time t2 is a different random variable with possibly
different functions
I with CDF F1 (xi ; t2 )
I with PDF w1(xi; t2) = dF1(xi; t2)/dxi
I These functions specify how the value of one sample is distributed
I The index 1 in w1 indicates that we consider a single random variable from the
process –> distributions of order 1
Distributions of order 2

I A pair of random variables f (t1 ) and f (t2 ) sampled from the random
process f (t) have
I joint CDF F2 (xi , xj ; t1 , t2 )
I joint PDF w2(xi, xj; t1, t2) = ∂²F2(xi, xj; t1, t2) / (∂xi ∂xj)
I These functions specify how the pair of values is distributed (are
distributions of order 2)
I Marginal integration
w1(xi; t1) = ∫_{−∞}^{∞} w2(xi, xj; t1, t2) dxj

I (integrating over one variable makes it disappear –> only the other one
remains; see the numerical sketch below)
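The marginal integration above can be checked numerically; a sketch assuming scipy/numpy, with a bivariate Gaussian standing in for w2 (my arbitrary choice), whose exact marginal is a 1-D Gaussian:

    import numpy as np
    from scipy import stats
    from scipy.integrate import trapezoid

    # Joint PDF w2(xi, xj): a correlated bivariate Gaussian (example choice)
    w2 = stats.multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.6], [0.6, 1.0]])

    xi = 0.5                                  # point where the marginal is evaluated
    xj = np.linspace(-10.0, 10.0, 4001)       # integration grid for the other variable

    # w1(xi) = integral over xj of w2(xi, xj)
    joint_values = w2.pdf(np.column_stack([np.full_like(xj, xi), xj]))
    w1_numeric = trapezoid(joint_values, xj)

    w1_exact = stats.norm(0.0, 1.0).pdf(xi)   # known marginal of this joint PDF
    print(w1_numeric, w1_exact)               # the two values should agree closely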
Distributions of order n

I Generalize to n samples of the random process


I A set of n random variables f (t1 ), ...f (tn ) sampled from the random
process f (t) have
I joint CDF Fn (x1 , ...xn ; t1 , ...tn )
I joint PDF wn(x1, ...xn; t1, ...tn) = ∂ⁿFn(x1, ...xn; t1, ...tn) / (∂x1 ...∂xn)
I These functions specify how the whole set of n values is distributed
(are distributions of order n)
Statistical averages

We characterize random processes using statistical / temporal averages


(moments)
1. Average value
\overline{f(t1)} = µ(t1) = ∫_{−∞}^{∞} x · w1(x; t1) dx

2. Average squared value (valoarea patratica medie)


\overline{f²(t1)} = ∫_{−∞}^{∞} x² · w1(x; t1) dx
Statistical averages - variance

3. Variance (= dispersia)
σ²(t1) = \overline{(f(t1) − µ(t1))²} = ∫_{−∞}^{∞} (x − µ(t1))² · w1(x; t1) dx

I The variance can be computed as:

σ²(t1) = \overline{(f(t1) − µ(t1))²} = \overline{f²(t1)} − 2µ(t1)·\overline{f(t1)} + µ(t1)² = \overline{f²(t1)} − µ(t1)²
Statistical averages - autocorrelation

4. The autocorrelation function


Rff(t1, t2) = \overline{f(t1) f(t2)} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x1 x2 · w2(x1, x2; t1, t2) dx1 dx2

5. The correlation function (for different random processes f (t) and


g(t))
Rfg(t1, t2) = \overline{f(t1) g(t2)} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x1 y2 · w2(x1, y2; t1, t2) dx1 dy2

I Note 1:
I all these values are calculated across all realizations, at a single time t1
I all these characterize only the r.v. at time t1
I at a different time t2, the r.v. f(t2) is different, so all average values
might be different (a numerical sketch follows below)
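A sketch of how these statistical averages are estimated across realizations (assuming numpy; the noise process with mean 2 and standard deviation 3 is only an illustrative choice):

    import numpy as np

    rng = np.random.default_rng(0)

    # Many realizations of a toy random process: Gaussian noise with mean 2, std 3
    n_realizations, n_samples = 100_000, 64
    f = 2.0 + 3.0 * rng.standard_normal((n_realizations, n_samples))

    t1, t2 = 10, 20                            # two fixed time indices
    x = f[:, t1]                               # the r.v. f(t1), seen across all realizations

    mean_t1 = x.mean()                         # average value         ~ 2
    mean_sq_t1 = (x ** 2).mean()               # average squared value ~ 13
    var_t1 = ((x - mean_t1) ** 2).mean()       # variance              ~ 9
    R_t1_t2 = (f[:, t1] * f[:, t2]).mean()     # autocorrelation R_ff(t1, t2) ~ 4 (= mean^2 here)
    print(mean_t1, mean_sq_t1, var_t1, R_t1_t2)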
Temporal averages

I What to do when we only have access to a single realization?


I Compute values for a single realization f^(k)(t), across all time
moments

1. Temporal average value


\overline{f^(k)(t)} = µ^(k) = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} f^(k)(t) dt

I This value does not depend on time t

2. Temporal average squared value


\overline{[f^(k)(t)]²} = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} [f^(k)(t)]² dt
Temporal variance

3. Temporal variance
σ² = \overline{(f^(k)(t) − µ^(k))²} = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} (f^(k)(t) − µ^(k))² dt

I The variance can be computed as:

σ² = \overline{[f^(k)(t)]²} − [µ^(k)]²


Temporal autocorrelation

4. The temporal autocorrelation function

Rff(t1, t2) = \overline{f^(k)(t1 + t) f^(k)(t2 + t)}

Rff(t1, t2) = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} f^(k)(t1 + t) f^(k)(t2 + t) dt

5. The temporal correlation function (for different random processes f (t)


and g(t))
Rfg(t1, t2) = \overline{f^(k)(t1 + t) g^(k)(t2 + t)}

Rfg(t1, t2) = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} f^(k)(t1 + t) g^(k)(t2 + t) dt
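In discrete time and with a finite record, the temporal averages above become sums over one realization; a minimal sketch assuming numpy (the process and the helper R_ff are my own illustrative choices):

    import numpy as np

    rng = np.random.default_rng(1)

    # One single realization f^(k)[n] of a noise process (finite length instead of T -> infinity)
    N = 100_000
    f_k = 2.0 + 3.0 * rng.standard_normal(N)

    mu_k = f_k.mean()                          # temporal average value
    mean_sq_k = (f_k ** 2).mean()              # temporal average squared value
    var_k = mean_sq_k - mu_k ** 2              # temporal variance

    def R_ff(f, tau):
        # temporal autocorrelation estimate at integer lag tau (samples)
        return np.mean(f[:len(f) - tau] * f[tau:])

    print(mu_k, var_k, R_ff(f_k, 0), R_ff(f_k, 5))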
Stationary random processes

I All the statistical averages are dependent on the time t1


I i.e. they might be different for a sample at t2
I Stationary random process = when the statistical averages are identical
upon shifting the time origin (e.g. delaying the signal)
I The PDFs are identical when shifting the time origin:

wn(x1, ...xn; t1, ...tn) = wn(x1, ...xn; t1 + τ, ...tn + τ)

I Strictly stationary / strongly stationary / strict-sense stationary:


I relation holds for every n
I Weakly stationary / wide-sense stationary:
I relation holds only for n = 1 and n = 2 (the most used)
Consequences of stationarity
I For n = 1:
w1 (xi ; t1 ) = w1 (xi ; t2 ) = w1 (xi )
I Consequence: the average value, average squared value, variance of a
sample are all identical for any time t
\overline{f(t)} = constant, ∀t
\overline{f²(t)} = constant, ∀t
σ²(t) = constant, ∀t
I For n = 2:
w2(xi, xj; t1, t2) = w2(xi, xj; 0, t2 − t1) = w2(xi, xj; t2 − t1)
I Consequence: the autocorrelation / correlation functions depend only
on the time difference t2 − t1 between the samples, no matter where
they are located
Rff (t1 , t2 ) = Rff (t2 − t1 ) = Rff (τ )
Rfg (t1 , t2 ) = Rfg (t2 − t1 ) = Rfg (τ )
Ergodic random processes

I In practice, we have access to a single realization


I Ergodic random process = when the temporal averages on any
realization are equal to the statistical averages
I We can compute all averages from a single realization
I the realization must be very long (length → ∞)
I a realization is characteristic of the whole process
I all realizations are statistically similar to one another
I Most random processes we care about are ergodic and stationary
I e.g. noises
I Example of a non-ergodic process:
I throw a die once, then the next 50 values are identical to the first throw
I a single realization is not characteristic of the whole process (see the sketch below)
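A sketch of the contrast above (assuming numpy): for independent die throws the temporal average of one realization matches the statistical average, while for the "one throw, then 50 identical values" process it does not.

    import numpy as np

    rng = np.random.default_rng(2)
    n_realizations, n_samples = 10_000, 50

    # Ergodic example: every sample is an independent die throw
    ergodic = rng.integers(1, 7, size=(n_realizations, n_samples))

    # Non-ergodic example: throw the die once, then repeat that value 50 times
    first_throw = rng.integers(1, 7, size=(n_realizations, 1))
    non_ergodic = np.repeat(first_throw, n_samples, axis=1)

    for name, proc in [("ergodic", ergodic), ("non-ergodic", non_ergodic)]:
        ensemble_avg = proc[:, 0].mean()       # statistical average across realizations ~ 3.5
        temporal_avg = proc[0, :].mean()       # temporal average of a single realization
        print(name, ensemble_avg, temporal_avg)
    # ergodic: both ~3.5; non-ergodic: temporal average = whatever the first throw happened to be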
Practical distributions

I Some of the most encountered probability density functions:


I The uniform distribution U[a, b]
I wX(x) = 1/(b − a) for x ∈ [a, b], and 0 otherwise
I The normal (gaussian) distribution N(µ, σ 2 )

f(x) = (1 / (σ√(2π))) · e^(−(x − µ)² / (2σ²))

I has average value µ


I has variance (“dispersia”) σ 2
I has the familiar “bell” shape
I variance controls width
I narrower = taller, wider = shorter
Computation of probabilities for normal functions
I We sometimes need to compute the integral ∫_a^b of a normal PDF
I Use the error function:
erf(z) = (2/√π) ∫_0^z e^(−t²) dt

I The cumulative distribution function of a normal distribution N(µ, σ 2 )


F(x) = (1/2) · (1 + erf((x − µ) / (σ√2)))
I The error function can easily be evaluated online, e.g. by searching for
erf(0.5) on Google
I Also, we might need:
I erf (−∞) = −1
I erf (∞) = 1
I Examples at the blackboard (a code sketch follows below)
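A small sketch of this computation using only Python's standard library (the µ, σ and interval values are arbitrary examples):

    import math

    def normal_cdf(x, mu, sigma):
        # F(x) = 1/2 * (1 + erf((x - mu) / (sigma * sqrt(2))))
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

    def normal_prob(a, b, mu, sigma):
        # P{a <= X <= b} = F(b) - F(a)
        return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

    # X ~ N(0, 1): probability of falling within one standard deviation of the mean
    print(normal_prob(-1.0, 1.0, mu=0.0, sigma=1.0))     # ~0.6827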
Properties of the auto-correlation function

I For a stationary random process:

Rff(τ) = \overline{f(t) f(t + τ)}

Rff (t1 , t2 ) = Rff (τ = t2 − t1 )


I It is the average value of the product of two samples that are τ apart in time
I Depends on a single value τ = time difference of the two samples
The Wiener-Khinchin theorem

I Rom: teorema Wiener-Hincin


I The Fourier transform of the autocorr function = power
spectral density of the process
Sff(ω) = ∫_{−∞}^{∞} Rff(τ) e^(−jωτ) dτ

Rff(τ) = (1/2π) ∫_{−∞}^{∞} Sff(ω) e^(jωτ) dω
I No proof
I The power spectral density
I tells the average power of the process at every frequency
I Some random processes contain mostly low frequencies (they vary rather slowly)
I Some random processes contain mostly high frequencies (they vary rather fast)
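A discrete-time sketch of the Wiener-Khinchin relation (assuming numpy; the moving-average-smoothed noise is my example of a "slow" process): the PSD is estimated as the DFT of the estimated autocorrelation, and most of the power ends up at low frequencies.

    import numpy as np

    rng = np.random.default_rng(3)
    N, L = 4096, 64

    # A "slow" random process: white noise smoothed by a moving-average filter
    f = np.convolve(rng.standard_normal(N), np.ones(8) / 8, mode="same")

    # Biased estimate of the autocorrelation R_ff(tau) for lags 0 .. L-1
    R = np.array([np.mean(f[:N - tau] * f[tau:]) for tau in range(L)])

    # Wiener-Khinchin: PSD = Fourier transform of the (even) autocorrelation sequence
    R_sym = np.concatenate([R[::-1], R[1:]])             # lags -(L-1) .. (L-1)
    S = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(R_sym)).real)
    print(S[len(S) // 2], S[0])                          # power at omega = 0 vs. near Nyquist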
White noise

I White noise = a random process whose autocorrelation function is a Dirac impulse

Rff (τ ) = δ(τ )

I Any two different samples are uncorrelated
I each sample is independent of all the others
I Power spectral density = a constant
I has equal power at all frequencies
I In real life, the power goes to 0 at very high frequencies
I e.g. samples which are very close in time are necessarily correlated
I = band-limited white noise (see the sketch below)
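A quick numerical check of these statements (assuming numpy): the estimated autocorrelation of generated white Gaussian noise is close to 1 at τ = 0 and close to 0 at every other lag.

    import numpy as np

    rng = np.random.default_rng(4)
    N = 200_000
    w = rng.standard_normal(N)                 # discrete-time white Gaussian noise, variance 1

    for tau in range(5):
        R_tau = np.mean(w[:N - tau] * w[tau:]) # autocorrelation estimate at lag tau
        print(f"R_ww({tau}) ~ {R_tau:+.4f}")   # ~ +1.0000 at tau = 0, ~ 0 otherwise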
Properties of the autocorrelation function

1. Is even
Rff (τ ) = Rff (−τ )

I Proof: change variable in definition

2. At infinity it tends to a constant

Rff(∞) = (\overline{f(t)})² = const

I Proof: two samples separated by an infinite time interval are independent, so
\overline{f(t)f(t + τ)} → \overline{f(t)} · \overline{f(t + τ)} = (\overline{f(t)})²

3. It is maximum at 0
Rff (0) ≥ Rff (τ )

I Proof: start from \overline{(f(t) − f(t + τ))²} ≥ 0 and expand the square


I Interpretation: two different samples may differ from each other, but a sample is
always identical to itself
Properties of the autocorrelation function

4. The value at 0 = the power of the random process


Rff(0) = (1/2π) ∫_{−∞}^{∞} Sff(ω) dω

I Proof: put τ = 0 in the inverse Fourier transform of the Wiener-Khinchin
theorem

5. Variance = difference between values at 0 and ∞

σ 2 = Rff (0) − Rff (∞)

I Proof: Rff(0) = \overline{f(t)²}, Rff(∞) = (\overline{f(t)})²
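A closing numerical check of properties 2-5 above (a sketch assuming numpy; the smoothed noise with a non-zero mean is an arbitrary stationary example):

    import numpy as np

    rng = np.random.default_rng(5)
    N = 500_000

    # Stationary process with non-zero mean: smoothed noise plus a constant
    f = 1.5 + np.convolve(rng.standard_normal(N), np.ones(10) / 10, mode="same")

    def R_ff(tau):
        tau = abs(tau)                          # estimate uses |tau|, so evenness holds by construction
        return np.mean(f[:N - tau] * f[tau:])

    print(R_ff(0) >= R_ff(7))                   # property 3: maximum at tau = 0
    print(R_ff(5000), f.mean() ** 2)            # property 2: R(large tau) ~ (mean)^2
    print(R_ff(0), (f ** 2).mean())             # property 4: R(0) = mean square value (power)
    print(R_ff(0) - R_ff(5000), f.var())        # property 5: R(0) - R(inf) ~ variance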
