The transmitter sends a signal with a different phase for bit one and bit zero.
The task of detection is now to decide which of the two signals was sent, 1 or 0.
Unlike the aircraft example, a signal is present under both hypotheses!
- Also, the prior probabilities are known (1/2 and 1/2). Not so for the radar problem!
Speech Recognition: Application Example
$$P_D = P(H_1; H_1) = \Pr(x[0] > \gamma; H_1) = \int_{\gamma}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}(t-1)^2\right) dt = Q(\gamma - 1)$$
MATLAB:
PD = qfunc(qfuncinv(1e-3) - 1) = 0.0183
PD = qfunc(qfuncinv(1e-8) - 1) = 1.9941e-06 => very small PD, the price to pay for the low PFA of 1e-8!
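Note that qfunc and qfuncinv ship with the Communications Toolbox; a minimal sketch of base-MATLAB equivalents via erfc/erfcinv, reproducing the numbers above:
MATLAB:
% Q-function and its inverse in base MATLAB:
% Q(x) = 0.5*erfc(x/sqrt(2)), Qinv(p) = sqrt(2)*erfcinv(2*p)
Q    = @(x) 0.5*erfc(x/sqrt(2));
Qinv = @(p) sqrt(2)*erfcinv(2*p);
PD1 = Q(Qinv(1e-3) - 1)    % 0.0183
PD2 = Q(Qinv(1e-8) - 1)    % 1.9941e-06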
Neyman-Pearson Theorem
$$L(x) = \frac{p(x; H_1)}{p(x; H_0)} > \gamma$$
$$P_{FA} = \int_{\{x : L(x) > \gamma\}} p(x; H_0)\, dx = \alpha$$
Neyman-Pearson Theorem: Example
$$L(x) = \frac{\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}(x[0]-1)^2\right)}{\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2[0]\right)} > \gamma$$
$$\exp\left(-\tfrac{1}{2}\left(x^2[0] - 2x[0] + 1 - x^2[0]\right)\right) > \gamma$$
$$\exp\left(x[0] - \tfrac{1}{2}\right) > \gamma$$
$$x[0] > \log\gamma + \tfrac{1}{2} = \gamma'$$
Decide H1 if $x[0] > \gamma'$. Same form as in the previous "ad hoc" example! Now we can find $\gamma'$ from the PFA constraint
$$P_{FA} = \Pr(x[0] > \gamma'; H_0) = \int_{\gamma'}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}t^2\right) dt = Q(\gamma') = \alpha$$
$$\gamma' = Q^{-1}(\alpha)$$
The same equation as before! The previous example was optimum in the NP sense!
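As a quick sanity check, a Monte Carlo sketch of this detector (assuming H0: x[0] ~ N(0,1) and H1: x[0] ~ N(1,1) as in the example; the trial count is arbitrary):
MATLAB:
% Verify gamma' = Qinv(alpha) empirically for the single-sample detector
alpha  = 1e-3;
gammap = sqrt(2)*erfcinv(2*alpha);      % Q^{-1}(alpha) = 3.09
M      = 1e7;                           % number of Monte Carlo trials
pfa_mc = mean(randn(M,1) > gammap)      % approaches alpha = 1e-3
pd_mc  = mean(1 + randn(M,1) > gammap)  % approaches Q(gammap - 1) = 0.0183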
Neyman-Pearson Theorem: More General Example
H1: DC level A (>0) in WGN, N samples
H0: WGN, N samples
$$L(x) = \frac{\frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right)}{\frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\right)} > \gamma$$
$$-\frac{1}{2\sigma^2}\left(-2A\sum_{n=0}^{N-1}x[n] + NA^2\right) > \log\gamma$$
$$\frac{A}{\sigma^2}\sum_{n=0}^{N-1}x[n] > \log\gamma + \frac{NA^2}{2\sigma^2}$$
$$\frac{1}{N}\sum_{n=0}^{N-1}x[n] > \frac{\sigma^2}{AN}\log\gamma + \frac{A}{2} = \gamma'$$
Compare sample mean (estimate of A) to threshold!
Neyman-Pearson Theorem: More General Example
$$T(x) = \frac{1}{N}\sum_{n=0}^{N-1}x[n]$$
H0: $T(x)$ follows a Gaussian distribution with mean 0 and variance $\sigma^2/N$
H1: $T(x)$ follows a Gaussian distribution with mean A and variance $\sigma^2/N$
$$P_{FA} = Q\left(\frac{\gamma'}{\sqrt{\sigma^2/N}}\right) \;\Rightarrow\; \gamma' = \sqrt{\frac{\sigma^2}{N}}\, Q^{-1}(P_{FA})$$
$$P_D = Q\left(\frac{\gamma' - A}{\sqrt{\sigma^2/N}}\right) = Q\left(Q^{-1}(P_{FA}) - \sqrt{\frac{A^2 N}{\sigma^2}}\right)$$
The energy-to-noise ratio is
$$\mathrm{ENR} = \frac{A^2 N}{\sigma^2}$$
Neyman-Pearson Theorem: More General Example
[Figure: PD versus ENR [dB] for PFA = 0.1, 0.01, 0.001, 0.0001; PD increases with ENR and, for fixed ENR, with the allowed PFA.]
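The curves in the figure follow directly from $P_D = Q(Q^{-1}(P_{FA}) - \sqrt{\mathrm{ENR}})$; a short sketch to regenerate them (plot ranges chosen to match the figure):
MATLAB:
% PD versus ENR curves for several PFA values
Q    = @(x) 0.5*erfc(x/sqrt(2));
Qinv = @(p) sqrt(2)*erfcinv(2*p);
enr_dB = 0:0.1:20;
enr    = 10.^(enr_dB/10);               % ENR = A^2*N/sigma^2, linear scale
figure; hold on;
for pfa = [0.1 0.01 0.001 0.0001]
    plot(enr_dB, Q(Qinv(pfa) - sqrt(enr)));
end
xlabel('ENR [dB]'); ylabel('PD');
legend('PFA=0.1','PFA=0.01','PFA=0.001','PFA=0.0001','Location','southeast');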
Mean-shifted Gauss-Gauss problem
Assume that we have a test statistic $T(x)$ that follows a Gaussian distribution under both H0 and H1:
$$T \sim \begin{cases} N(\mu_0, \sigma^2) & \text{under } H_0 \\ N(\mu_1, \sigma^2) & \text{under } H_1 \end{cases}$$
It can be shown that PD is obtained as
$$P_D = Q\left(Q^{-1}(P_{FA}) - \sqrt{d^2}\right)$$
so the deflection coefficient
$$d^2 = \frac{(\mu_1 - \mu_0)^2}{\sigma^2}$$
completely characterizes performance for the Gauss-Gauss problem.
Change in Variance
H0: WGN with variance $\sigma_0^2$, N samples
H1: WGN with variance $\sigma_1^2$ ($> \sigma_0^2$), N samples
$$L(x) = \frac{\frac{1}{(2\pi\sigma_1^2)^{N/2}} \exp\left(-\frac{1}{2\sigma_1^2}\sum_{n=0}^{N-1}x^2[n]\right)}{\frac{1}{(2\pi\sigma_0^2)^{N/2}} \exp\left(-\frac{1}{2\sigma_0^2}\sum_{n=0}^{N-1}x^2[n]\right)} > \gamma$$
$$-\frac{N}{2}\log\sigma_1^2 + \frac{N}{2}\log\sigma_0^2 - \frac{1}{2}\left(\frac{1}{\sigma_1^2} - \frac{1}{\sigma_0^2}\right)\sum_{n=0}^{N-1}x^2[n] > \log\gamma$$
$$\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] > \frac{\frac{2\log\gamma}{N} + \log\sigma_1^2 - \log\sigma_0^2}{\frac{1}{\sigma_0^2} - \frac{1}{\sigma_1^2}} = \gamma'$$
So we compute an estimate of the variance and decide H1 if it exceeds the threshold!
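A Monte Carlo sketch of this variance detector ($\sigma_0$, $\sigma_1$, N, the threshold, and the trial count below are example values):
MATLAB:
% Decide H1 when the variance estimate (1/N)*sum(x.^2) exceeds gamma'
N = 50; sigma0 = 1; sigma1 = 1.5; M = 1e5;
T0 = mean((sigma0*randn(M,N)).^2, 2);   % statistic under H0
T1 = mean((sigma1*randn(M,N)).^2, 2);   % statistic under H1
gammap = 1.3;                           % example threshold between 1 and 2.25
pfa = mean(T0 > gammap)
pd  = mean(T1 > gammap)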
Receiver Operating Characteristics (ROC)
Assume that we have expressions for PFA and PD as functions of the threshold. Now vary the threshold from $-\infty$ to $+\infty$ and record the resulting (PFA, PD) pairs. Plot these pairs with PFA on the x-axis and PD on the y-axis.
[Figure: ROC curves, PD versus PFA, for ENR = 0 dB and ENR = 10 dB. The ROC is always above the 45-degree line!]
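A sketch that traces ROC curves for the mean-shifted problem by sweeping the threshold as described above (the finite sweep range stands in for $-\infty..+\infty$):
MATLAB:
% ROC: sweep threshold g, plot (PFA, PD) = (Q(g), Q(g - sqrt(ENR)))
Q = @(x) 0.5*erfc(x/sqrt(2));
g = -6:0.01:6;                          % threshold sweep
figure; hold on;
for enr_dB = [0 10]
    plot(Q(g), Q(g - sqrt(10^(enr_dB/10))));
end
plot([0 1], [0 1], 'k--');              % 45-degree line
xlabel('PFA'); ylabel('PD');
legend('ENR = 0 dB','ENR = 10 dB','45-degree line','Location','southeast');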
3.5. Irrelevant Data
Irrelevant data may be discarded; it does not affect the likelihood ratio test (LRT) of the NP theorem.
But be careful about which data is really irrelevant!
Consider DC level detection in WGN and assume that we also observe reference noise samples wR[n] for n = 0, 1, …, N−1. The observed data set is {x[0], x[1], …, x[N−1], wR[0], wR[1], …, wR[N−1]}. If x[n] = w[n] under H0 and x[n] = A + w[n] under H1, and wR[n] = w[n] under both hypotheses, then wR[n] can actually be used to cancel out the noise!
T = x[0] − wR[0] = A under H1 and 0 under H0
So the detector using T > A/2 will give perfect detection!!!
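A toy sketch of this (contrived) perfect detector; A and the trial count are arbitrary:
MATLAB:
% wR[n] equals the noise in x[n] exactly, so T = x[0]-wR[0] is noiseless
A = 0.1; M = 1e4; w = randn(M,1);
T_H1 = (A + w) - w;                      % = A exactly under H1
T_H0 = w - w;                            % = 0 exactly under H0
perfect = all(T_H1 > A/2) && all(T_H0 <= A/2)   % logical 1: no errors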
3.5. Irrelevant Data
As another example, let's consider the following signal model:
$$H_0: x[n] = w[n], \quad n = 0, 1, \cdots, 2N-1$$
$$H_1: x[n] = \begin{cases} A + w[n] & n = 0, 1, \cdots, N-1 \\ w[n] & n = N, N+1, \cdots, 2N-1 \end{cases}$$
So the observed vector is $\boldsymbol{x} = [\boldsymbol{x}_1^T\; \boldsymbol{x}_2^T]^T$, where $\boldsymbol{x}_1$ denotes the first N samples and $\boldsymbol{x}_2$ the rest of the samples.
$$L(x) = \frac{\frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right) \cdot \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=N}^{2N-1}x^2[n]\right)}{\frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\right) \cdot \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=N}^{2N-1}x^2[n]\right)} > \gamma$$
The factors involving $\boldsymbol{x}_2$ cancel, leaving
$$L(x) = \frac{\frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right)}{\frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\right)} = \frac{p(\boldsymbol{x}_1; H_1)}{p(\boldsymbol{x}_1; H_0)} > \gamma$$
The data $\boldsymbol{x}_2$ is in fact irrelevant!!!
3.6. Minimum Probability of Error
In some applications we may naturally assign prior probabilities to the hypotheses. For example, in digital communication using BPSK or on-off keying, both bits/hypotheses are equally likely, so that P(H0) = P(H1) = 0.5. Of course, in the radar application this is not possible.
The Bayesian approach to hypothesis testing is analogous to Bayesian estimation.
We can define the probability of error as
$$P_e = \Pr\{\text{decide } H_0, H_1 \text{ true}\} + \Pr\{\text{decide } H_1, H_0 \text{ true}\} = P(H_0|H_1)P(H_1) + P(H_1|H_0)P(H_0)$$
where $P(H_i|H_j)$ is the conditional probability of deciding Hi given that Hj is true, a slightly different meaning than $P(H_i; H_j)$.
3.6. Minimum Probability of Error
It can be shown that, to minimize $P_e$, we optimally decide H1 if
$$\frac{p(x|H_1)}{p(x|H_0)} > \frac{P(H_0)}{P(H_1)} = \gamma$$
Similar to the NP test! But now the probabilities are conditional, and the threshold is given directly, with no need to search for it. Equivalently,
$$\Rightarrow P(H_1|x) > P(H_0|x)$$
More generally, we can assign a cost $C_{ij}$ to deciding Hi when Hj is true and minimize the Bayes risk
$$R = E[C] = \sum_{i=0}^{1}\sum_{j=0}^{1} C_{ij}\, P(H_i|H_j)\, P(H_j)$$
We can assume C00 = C11 = 0. Now, the detector that minimizes the Bayes risk is
$$\frac{p(x|H_1)}{p(x|H_0)} > \frac{(C_{10}-C_{00})\,P(H_0)}{(C_{01}-C_{11})\,P(H_1)} = \gamma$$
which is again an LRT, but now with a cost-dependent threshold.
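A one-line sketch of the cost-dependent threshold with example costs (here a missed H1 is taken to be ten times as costly as a false alarm):
MATLAB:
% gamma = (C10-C00)*P(H0) / ((C01-C11)*P(H1))
C10 = 1; C01 = 10; C00 = 0; C11 = 0; PH0 = 0.5; PH1 = 0.5;
gamma = (C10 - C00)*PH0 / ((C01 - C11)*PH1)   % = 0.1, favors deciding H1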
3.8. Multiple Hypothesis Testing
For the case of more than two hypotheses, the NP criterion is rarely used in practice. Instead we use the Bayes risk, now defined as
$$R = E[C] = \sum_{i=0}^{M-1}\sum_{j=0}^{M-1} C_{ij}\, P(H_i|H_j)\, P(H_j)$$
where $M$ is the number of hypotheses. To minimize this cost we should choose the hypothesis $H_i$ that minimizes the expected posterior cost
$$C_i(x) = \sum_{j=0}^{M-1} C_{ij}\, P(H_j|x)$$
4.3. Matched Filters
In our previous case of a DC level A in WGN, the NP detector decides H1 if
$$T(x) = A\sum_{n=0}^{N-1} x[n] > \gamma'$$
Assume that A > 0; dividing both sides by NA gives
$$\frac{1}{N}\sum_{n=0}^{N-1} x[n] > \frac{\gamma'}{NA} = \gamma''$$
If A < 0, the inequality reverses.
For a general known deterministic signal s[n] in WGN, the same LRT derivation yields the test
$$T(x) = \sum_{n=0}^{N-1} x[n]\, s[n] > \gamma'$$
4.3. Matched Filters
The matched filter can be viewed as a correlator or replica-correlator, since we correlate the data with a replica of the signal.
An alternative implementation processes the input signal with a finite impulse response (FIR) filter with impulse response
$$h[n] = \begin{cases} s[N-1-n] & n = 0, 1, \cdots, N-1 \\ 0 & \text{otherwise} \end{cases}$$
and samples the output at time n = N−1:
$$y[N-1] = \sum_{n=0}^{N-1} x[n]\, s[n]$$
which is exactly the same as before! Proof: the FIR output is
$$y[n] = \sum_{i=0}^{N-1} h[i]\, x[n-i] = \sum_{i=0}^{N-1} s[N-1-i]\, x[n-i]$$
so at n = N−1, with the change of variable k = N−1−i,
$$y[N-1] = \sum_{k=0}^{N-1} x[k]\, s[k]$$
[Figure: matched filter output y[n] versus n; the output attains its maximum at the sampling instant n = N−1.]
Best performance is obtained when sampling at n = N−1! Noise may at times shift the location of the maximum, but n = N−1 is still the best sampling instant.
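A quick sketch verifying the equivalence of the FIR implementation and the replica correlator (random example signal and data):
MATLAB:
% FIR matched filter h[n] = s[N-1-n], output sampled at n = N-1
N = 10; s = randn(N,1); x = s + randn(N,1);   % H1-type data
h = flipud(s);                                % time-reversed replica
y = filter(h, 1, x);                          % FIR filtering
[y(N), sum(x.*s)]                             % identical values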
4.3. Matched Filters: Performance (under WGN)
Let us check the mean and variance of the matched filter output $T(x) = \sum_{n=0}^{N-1} x[n]\, s[n]$.
Under H0:
$$E[T; H_0] = E\left[\sum_{n=0}^{N-1} w[n]\, s[n]\right] = 0, \qquad \mathrm{Var}(T; H_0) = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} E[w[n]w[m]]\, s[n]\, s[m] = \sigma^2 \sum_{n=0}^{N-1} s^2[n] = \sigma^2\varepsilon$$
Under H1:
$$E[T; H_1] = E\left[\sum_{n=0}^{N-1} (w[n] + s[n])\, s[n]\right] = \varepsilon, \qquad \mathrm{Var}(T; H_1) = \sigma^2\varepsilon$$
where $\varepsilon = \sum_{n=0}^{N-1} s^2[n]$ is the signal energy. This is a mean-shifted Gauss-Gauss problem with deflection coefficient $d^2 = \varepsilon/\sigma^2$, so
$$P_D = Q\left(Q^{-1}(P_{FA}) - \sqrt{\varepsilon/\sigma^2}\right)$$
Only the signal energy matters, not its shape!
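A Monte Carlo sketch checking this performance prediction (the signal shape, N, sigma, and PFA below are example choices):
MATLAB:
% Simulated vs. theoretical PD for the replica correlator in WGN
Q = @(x) 0.5*erfc(x/sqrt(2)); Qinv = @(p) sqrt(2)*erfcinv(2*p);
N = 20; s = sin(2*pi*0.1*(0:N-1))'; sigma = 1; pfa = 1e-2; M = 1e5;
eps_s  = sum(s.^2);                            % signal energy
gammap = sqrt(sigma^2*eps_s)*Qinv(pfa);        % T ~ N(0, sigma^2*eps) under H0
T1 = (repmat(s', M, 1) + sigma*randn(M, N))*s; % correlator outputs under H1
[mean(T1 > gammap), Q(Qinv(pfa) - sqrt(eps_s/sigma^2))]  % simulation vs theory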
4.4. Generalized Matched Filter
The generalized matched filter handles the case where the noise is not WGN but instead colored Gaussian noise $\boldsymbol{w} \sim N(\boldsymbol{0}, \boldsymbol{C})$.
To determine the NP detector for this we again use the LRT:
$$p(\boldsymbol{x}; H_1) = \frac{1}{(2\pi)^{N/2} \det^{1/2}(\boldsymbol{C})} \exp\left(-\tfrac{1}{2}(\boldsymbol{x}-\boldsymbol{s})^T \boldsymbol{C}^{-1} (\boldsymbol{x}-\boldsymbol{s})\right)$$
$$p(\boldsymbol{x}; H_0) = \frac{1}{(2\pi)^{N/2} \det^{1/2}(\boldsymbol{C})} \exp\left(-\tfrac{1}{2}\boldsymbol{x}^T \boldsymbol{C}^{-1} \boldsymbol{x}\right)$$
$$l(\boldsymbol{x}) = \log\frac{p(\boldsymbol{x}; H_1)}{p(\boldsymbol{x}; H_0)} = -\tfrac{1}{2}\left[(\boldsymbol{x}-\boldsymbol{s})^T \boldsymbol{C}^{-1} (\boldsymbol{x}-\boldsymbol{s}) - \boldsymbol{x}^T \boldsymbol{C}^{-1} \boldsymbol{x}\right] = \boldsymbol{x}^T \boldsymbol{C}^{-1} \boldsymbol{s} - \tfrac{1}{2}\boldsymbol{s}^T \boldsymbol{C}^{-1} \boldsymbol{s}$$
Since the second term does not depend on the data, we get the equivalent test
$$T(\boldsymbol{x}) = \boldsymbol{x}^T \boldsymbol{C}^{-1} \boldsymbol{s} > \gamma'$$
4.4. Generalized Matched Filter
Let us check that the general equation reduces to our previous one for WGN.
For WGN, $\boldsymbol{C} = \sigma^2 \boldsymbol{I}$, so
$$T(\boldsymbol{x}) = \boldsymbol{x}^T\boldsymbol{s}/\sigma^2 > \gamma'$$
$$T(x) = \sum_{n=0}^{N-1} x[n]\, s[n] > \sigma^2\gamma' = \gamma''$$
Same as before!
4.4. Generalized Matched Filter
Let us assume that $\boldsymbol{C} = \mathrm{diag}(\sigma_0^2, \sigma_1^2, \cdots, \sigma_{N-1}^2)$. Now we get
$$T(\boldsymbol{x}) = \boldsymbol{x}^T \boldsymbol{C}^{-1} \boldsymbol{s} = \sum_{n=0}^{N-1} \frac{x[n]\, s[n]}{\sigma_n^2} > \gamma'$$
$$= \sum_{n=0}^{N-1} \frac{x[n]}{\sigma_n} \frac{s[n]}{\sigma_n}$$
Under H1:
$$T(\boldsymbol{x}) = \sum_{n=0}^{N-1} \left(\frac{w[n]}{\sigma_n} + \frac{s[n]}{\sigma_n}\right) \frac{s[n]}{\sigma_n}$$
The generalized matched filter prewhitens the noise samples, which also distorts the signal. After prewhitening, it correlates with the distorted signal.
4.4. Generalized Matched Filter
Let us write $\boldsymbol{C}^{-1} = \boldsymbol{D}^T \boldsymbol{D}$. Now the test statistic is
$$T(\boldsymbol{x}) = \boldsymbol{x}^T \boldsymbol{C}^{-1} \boldsymbol{s} = \boldsymbol{x}^T \boldsymbol{D}^T \boldsymbol{D} \boldsymbol{s} = \boldsymbol{x}'^T \boldsymbol{s}'$$
where $\boldsymbol{s}' = \boldsymbol{D}\boldsymbol{s}$ and $\boldsymbol{x}' = \boldsymbol{D}\boldsymbol{x}$.
To show that WGN is indeed produced, let $\boldsymbol{w}' = \boldsymbol{D}\boldsymbol{w}$. Now,
$$\boldsymbol{C}_{w'} = E[\boldsymbol{w}'\boldsymbol{w}'^T] = E[\boldsymbol{D}\boldsymbol{w}\boldsymbol{w}^T\boldsymbol{D}^T] = \boldsymbol{D}\, E[\boldsymbol{w}\boldsymbol{w}^T]\, \boldsymbol{D}^T = \boldsymbol{D}\boldsymbol{C}\boldsymbol{D}^T = \boldsymbol{D}(\boldsymbol{D}^T\boldsymbol{D})^{-1}\boldsymbol{D}^T = \boldsymbol{I}$$
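A small sketch of this prewhitening factorization using the Cholesky factor of C (the AR(1)-style covariance is just an example):
MATLAB:
% C = L*L' (Cholesky) gives D = inv(L), so Cinv = D'*D and D*C*D' = I
N = 8; C = toeplitz(0.9.^(0:N-1));      % example colored-noise covariance
L = chol(C, 'lower');
D = inv(L);                             % whitening matrix
disp(norm(D*C*D' - eye(N)))             % ~0: whitened noise has identity cov
s = randn(N,1); x = randn(N,1);
[x'*(C\s), (D*x)'*(D*s)]                % same statistic computed both ways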
4.4. Generalized Matched Filter
$$T(\boldsymbol{x}) = \boldsymbol{x}^T \boldsymbol{C}^{-1} \boldsymbol{s} > \gamma'$$
Let us determine the performance of the generalized matched filter.
Under H0:
$$E[T; H_0] = E[\boldsymbol{w}^T \boldsymbol{C}^{-1} \boldsymbol{s}] = 0$$
Under H1:
$$E[T; H_1] = E[(\boldsymbol{s} + \boldsymbol{w})^T \boldsymbol{C}^{-1} \boldsymbol{s}] = \boldsymbol{s}^T \boldsymbol{C}^{-1} \boldsymbol{s}$$
Under H0:
$$\mathrm{Var}(T; H_0) = E[\boldsymbol{s}^T \boldsymbol{C}^{-1} \boldsymbol{w}\boldsymbol{w}^T \boldsymbol{C}^{-1} \boldsymbol{s}] = \boldsymbol{s}^T \boldsymbol{C}^{-1} E[\boldsymbol{w}\boldsymbol{w}^T]\, \boldsymbol{C}^{-1} \boldsymbol{s} = \boldsymbol{s}^T \boldsymbol{C}^{-1} \boldsymbol{s}$$
Under H1 it can be shown that also
$$\mathrm{Var}(T; H_1) = \boldsymbol{s}^T \boldsymbol{C}^{-1} \boldsymbol{s}$$
4.4. Generalized Matched Filter
Now under H0, $T(\boldsymbol{x}) \sim N(0, \boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s})$, so
$$P_{FA} = Q\left(\frac{\gamma'}{\sqrt{\boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}}}\right) \quad\Rightarrow\quad \gamma' = Q^{-1}(P_{FA})\sqrt{\boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}}$$
$$P_D = Q\left(\frac{\gamma' - \boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}}{\sqrt{\boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}}}\right) = Q\left(\frac{Q^{-1}(P_{FA})\sqrt{\boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}} - \boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}}{\sqrt{\boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}}}\right) = Q\left(Q^{-1}(P_{FA}) - \sqrt{\boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}}\right)$$
Before, only the signal energy mattered. Now the signal shape also matters!
=> Design the signal shape (for a given energy) to maximize PD!
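A sketch of this signal-design idea: with fixed energy, $\boldsymbol{s}^T\boldsymbol{C}^{-1}\boldsymbol{s}$ (and hence PD) is maximized by placing the signal along the eigenvector of C with the smallest eigenvalue (the covariance and parameters below are example choices):
MATLAB:
% Compare PD of a 'best-shape' signal and a flat signal of equal energy
Q = @(x) 0.5*erfc(x/sqrt(2)); Qinv = @(p) sqrt(2)*erfcinv(2*p);
N = 8; C = toeplitz(0.9.^(0:N-1)); pfa = 1e-2; energy = 10;
[V, Lam] = eig(C); [~, k] = min(diag(Lam));
s_best = sqrt(energy)*V(:,k);           % along the weakest-noise direction
s_flat = sqrt(energy/N)*ones(N,1);      % same energy, poor shape here
PD = @(s) Q(Qinv(pfa) - sqrt(s'*(C\s)));
[PD(s_best), PD(s_flat)]                % shape matters, not just energy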
4.5. Multiple Signals: Binary case
Let us now assume that, instead of detecting whether a known signal is present or not, the problem is to detect which signal was sent. For example, in a communication system we must determine which of M signals was sent.
$$H_0: x[n] = s_0[n] + w[n], \quad n = 0, 1, \cdots, N-1$$
$$H_1: x[n] = s_1[n] + w[n], \quad n = 0, 1, \cdots, N-1$$
Let us use the minimum probability of error criterion. We decide H1 if
$$\frac{p(x|H_1)}{p(x|H_0)} > \gamma = \frac{P(H_0)}{P(H_1)} = 1$$
This is the ML rule. Using the definition of the multivariate Gaussian, we find that we should select the hypothesis $i$ for which
$$D_i^2 = \sum_{n=0}^{N-1} (x[n] - s_i[n])^2$$
is minimum. We can write
$$D_i^2 = \|\boldsymbol{x} - \boldsymbol{s}_i\|^2$$
So we choose the hypothesis whose signal vector is closest to $\boldsymbol{x}$.
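A minimal sketch of this minimum-distance receiver for the binary case (the antipodal signal pair is an example):
MATLAB:
% Decide the hypothesis whose signal vector is closest to x
N = 16; n = (0:N-1)';
s0 = cos(2*pi*0.1*n); s1 = -cos(2*pi*0.1*n);  % example signal pair
x  = s1 + randn(N,1);                         % data generated under H1
[~, ihat] = min([norm(x - s0)^2, norm(x - s1)^2]);
decided_bit = ihat - 1                        % 1 with high probability here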
4.5. Multiple Signals: Binary case
We select the hypothesis $i$ for which
$$D_i^2 = \sum_{n=0}^{N-1} (x[n] - s_i[n])^2$$
is minimum. Expanding the square, we can write this as
$$D_i^2 = \sum_{n=0}^{N-1} x^2[n] - 2\sum_{n=0}^{N-1} x[n]\, s_i[n] + \sum_{n=0}^{N-1} s_i^2[n]$$